Introducing principles of synaptic integration in the optimization of deep neural networks
Abstract
Plasticity circuits in the brain are known to be influenced by the distribution of the synaptic weights through the mechanisms of synaptic integration and local regulation of synaptic strength. However, the complex interplay of stimulation-dependent plasticity with local learning signals is disregarded by most of the artificial neural network training algorithms devised so far. Here, we propose a novel biologically inspired optimizer for artificial and spiking neural networks that incorporates key principles of synaptic plasticity observed in cortical dendrites: GRAPES (Group Responsibility for Adjusting the Propagation of Error Signals). GRAPES implements a weight-distribution-dependent modulation of the error signal at each node of the network. We show that this biologically inspired mechanism leads to a substantial performance improvement for artificial and spiking networks with feedforward, convolutional, and recurrent architectures, mitigates catastrophic forgetting, and is optimally suited for dedicated hardware implementations. Overall, our work indicates that reconciling neurophysiology insights with machine intelligence is key to boosting the performance of neural networks.
Authors’ notes
We introduce GRAPES (Group Responsibility for Adjusting the Propagation of Error Signals), an optimization strategy that relies on the notion of node importance in propagating the error information during learning. The node importance identifies the neurons that have a large number of strong connections and are therefore responsible for substantially amplifying or attenuating the input signal as it propagates to the downstream layers. The concept of node importance is inspired by the process of synaptic integration in biological circuits: a non-linear mechanism through which dendritic branches receiving input from multiple strong connections have a higher probability of boosting the incoming signal as it travels to the soma, compared to dendritic branches whose incoming connections are, on average, weak.
By translating this concept to artificial neural networks, GRAPES, as its name indicates, exploits the collective information of a neuron's multiple connections to modulate the parameter updates during learning: synaptic plasticity is influenced by the distribution of the weights within each layer. Our results demonstrate that GRAPES not only provides substantial improvements in the performance of deep artificial and spiking neural networks, but also mitigates the accuracy degradation due to catastrophic forgetting, a phenomenon that prevents artificial networks from reproducing the kind of continual learning that occurs in the human brain.
These results open a new avenue towards narrowing the gap between backpropagation and biologically plausible learning schemes.
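To make the mechanism concrete, the following minimal NumPy sketch rescales the backpropagated error at each node by a node-importance factor derived from the distribution of that node's incoming weights. The importance measure (summed absolute incoming weight normalized by the layer mean), the sigmoid layer, and the function names are illustrative assumptions; the exact formulation used in GRAPES may differ.

```python
import numpy as np

def node_importance(W):
    # Illustrative importance measure (assumption, not the paper's exact formula):
    # each node's summed absolute incoming weight, normalized by the layer mean.
    strength = np.abs(W).sum(axis=1)            # one value per node (rows are nodes)
    return strength / strength.mean()

def grapes_like_update(W, x, delta_out, lr=0.1):
    # One layer update in which the backpropagated error is rescaled by the
    # node-importance factor before computing the weight gradient (sketch only).
    z = W @ x
    h = 1.0 / (1.0 + np.exp(-z))                # sigmoid activation
    delta = delta_out * h * (1.0 - h)           # standard backprop error at this layer
    delta_mod = node_importance(W) * delta      # weight-distribution-dependent modulation
    return W - lr * np.outer(delta_mod, x)

# Tiny usage example with random data
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(4, 3))          # 4 nodes, 3 inputs each
x = rng.normal(size=3)
delta_out = rng.normal(size=4)                  # error arriving from the layer above
W_new = grapes_like_update(W, x, delta_out)
```

Nodes whose incoming weights are collectively strong receive a larger share of the error, mirroring the synaptic-integration intuition described above.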
Catastrophic forgetting
Catastrophic forgetting refers to the phenomenon by which the process of learning a new task causes a sudden and dramatic degradation of the knowledge a neural network has previously acquired. This is a key limitation of current AI systems, which struggle to reproduce continual learning. In contrast to existing solutions, we show that applying GRAPES mitigates the effects of catastrophic forgetting without introducing additional training steps, such as replaying past data.
We suggest that these properties stem from the fact that GRAPES combines, within the error signal, information related to the response to the current input with information about the internal state of the network that is independent of the data sample.
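To illustrate the phenomenon itself (independently of GRAPES), the hedged sketch below trains a single logistic-regression classifier sequentially on two synthetic tasks and measures how accuracy on the first task degrades after training on the second. The datasets, model, and hyperparameters are illustrative assumptions, not those used in our experiments.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_task(shift):
    # Two-dimensional Gaussian inputs; the shift moves the decision boundary.
    X = rng.normal(size=(200, 2)) + shift
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(float)
    return X, y

def train(w, b, X, y, epochs=200, lr=0.1):
    # Plain logistic-regression gradient descent, with no forgetting mitigation.
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def accuracy(w, b, X, y):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return np.mean((p > 0.5) == y)

Xa, ya = make_task(shift=0.0)    # task A
Xb, yb = make_task(shift=4.0)    # task B, different input statistics

w, b = np.zeros(2), 0.0
w, b = train(w, b, Xa, ya)
acc_before = accuracy(w, b, Xa, ya)
w, b = train(w, b, Xb, yb)       # continue training on task B only
acc_after = accuracy(w, b, Xa, ya)
print(f"Task A accuracy: {acc_before:.2f} -> {acc_after:.2f} after learning task B")
```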
Harvesting GRAPES for neuro-inspired AI
Our optimization method speeds up learning across a wide range of training algorithms and, at the same time, mitigates the phenomenon of catastrophic forgetting. For this reason, our optimizer could benefit applications that require fast learning in settings where new inputs arrive with a non-uniform distribution, for example a robot learning as it explores a new environment.
Additionally, biologically plausible training schemes such as feedback alignment benefit greatly from our optimization method; hardware devices that cannot support backpropagation but can support feedback alignment are therefore a promising area of application for our work.
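For context, feedback alignment replaces the transposed forward weights in backpropagation's backward pass with fixed random feedback matrices. The NumPy sketch below shows that single substitution for a two-layer network; it is the generic feedback alignment scheme, not the GRAPES-modulated variant, and all names and hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two-layer network: input -> hidden (sigmoid) -> output (linear)
n_in, n_hid, n_out = 5, 8, 2
W1 = rng.normal(scale=0.5, size=(n_hid, n_in))
W2 = rng.normal(scale=0.5, size=(n_out, n_hid))
B2 = rng.normal(scale=0.5, size=(n_hid, n_out))  # fixed random feedback matrix, used instead of W2.T

def fa_step(x, target, lr=0.05):
    # One training step with feedback alignment on a squared-error loss.
    global W1, W2
    h = 1.0 / (1.0 + np.exp(-(W1 @ x)))          # forward pass
    y = W2 @ h
    e = y - target                               # output error
    delta_h = (B2 @ e) * h * (1.0 - h)           # error sent back through the fixed random B2
    W2 -= lr * np.outer(e, h)
    W1 -= lr * np.outer(delta_h, x)
    return float(0.5 * np.sum(e ** 2))

x = rng.normal(size=n_in)
target = np.array([1.0, 0.0])
losses = [fa_step(x, target) for _ in range(100)]
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Because the backward pass never reads the forward weights, the feedback pathway can be hard-wired, which is part of what makes such schemes attractive for dedicated hardware.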
Furthermore, GRAPES improves the performance of spiking neural networks (SNNs). SNNs offer an energy-efficient alternative for implementing deep learning applications; however, they still lag behind artificial neural networks (ANNs) in terms of accuracy. Our work paves the way for biologically inspired algorithms to narrow the gap between the performance of SNNs and ANNs, enabling applications in the rapidly growing field of neuromorphic chips.
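As a reminder of the basic building block of an SNN, the sketch below simulates a leaky integrate-and-fire neuron, which communicates through binary spikes rather than continuous activations. This is a generic textbook model with illustrative parameters, not the specific spiking network used in our experiments.

```python
import numpy as np

def lif_neuron(input_current, dt=1.0, tau=20.0, v_thresh=1.0, v_reset=0.0):
    # Minimal leaky integrate-and-fire neuron: the membrane potential leaks toward
    # zero, integrates the input, and emits a binary spike on crossing the threshold.
    v, spikes = 0.0, []
    for i_t in input_current:
        v += dt / tau * (-v + i_t)   # leaky integration of the input current
        if v >= v_thresh:
            spikes.append(1)
            v = v_reset              # reset the membrane potential after a spike
        else:
            spikes.append(0)
    return np.array(spikes)

# A constant drive above threshold produces a regular spike train
spike_train = lif_neuron(np.full(100, 1.5))
print("spike count:", spike_train.sum())
```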
Conclusion
Our results demonstrate that GRAPES not only provides substantial improvements in the performance of deep artificial and spiking neural networks, but also mitigates the accuracy degradation due to catastrophic forgetting.