Understanding Mode Connectivity via Parameter Space Symmetry
Abstract
It has been observed that the global minima of neural networks are connected by curves along which the train and test loss is almost constant. This phenomenon, often referred to as mode connectivity, has inspired applications such as model ensembling and fine-tuning. However, despite the empirical evidence, a theoretical explanation is still lacking. We explore the connectedness of minima through a new approach, parameter space symmetry. By relating the topology of symmetry groups to the topology of minima, we derive the number of connected components of the minima of full-rank linear networks. In particular, we show that skip connections reduce the number of connected components. We then prove mode connectivity up to permutation for linear networks. We also provide explicit expressions for connecting curves in the minima induced by symmetry.
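As a minimal illustration of a symmetry-induced connecting curve, consider a two-layer linear network x -> W2 W1 x. For any invertible matrix g, replacing (W1, W2) with (g W1, W2 g^{-1}) leaves the product W2 W1, and hence the loss, unchanged. The sketch below traces such a curve using rotations g(t). The toy weights, the data, and the helper names (matmul, rotation, loss, symmetry_curve) are assumptions made for illustration only and are not taken from the paper. Rotations stay within a single connected component of the invertible matrices, so this shows only one kind of curve, not the component count itself.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def rotation(t):
    """A 2x2 rotation g(t); it is invertible, with inverse g(-t)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]

def loss(w1, w2, x, y):
    """Squared error of the two-layer linear network x -> W2 W1 x."""
    pred = matmul(w2, matmul(w1, x))
    return sum((pred[i][j] - y[i][j]) ** 2
               for i in range(len(y)) for j in range(len(y[0])))

def symmetry_curve(w1, w2, t):
    """Point at parameter t on the curve (g(t) W1, W2 g(t)^{-1})."""
    return matmul(rotation(t), w1), matmul(w2, rotation(-t))

# Hypothetical toy weights and inputs (not from the paper).
W1 = [[1.0, 2.0], [0.0, 1.0]]
W2 = [[2.0, 0.0], [1.0, 1.0]]
X = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
# Targets are generated by the network itself, so (W1, W2) is a global minimum.
Y = matmul(W2, matmul(W1, X))

# Evaluate the loss at several points along the curve.
losses = [loss(*symmetry_curve(W1, W2, 0.1 * k), X, Y) for k in range(32)]
max_loss = max(losses)
```

Here max_loss stays at zero up to floating-point error while the weights themselves change along the curve, which is the sense in which the symmetry connects distinct minima.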