Nonconvex Min-Max Bilevel Optimization for Task Robust Meta Learning
Abstract
As the number of learning tasks increases, robustness and adaptability become key criteria for evaluating the performance of modern machine learning models. This paper focuses on developing a generic robust bilevel optimization framework with potential applications to meta-learning, transfer learning, and continual learning. By leveraging recent advances in nonconvex min-max optimization, our proposed task-robust bilevel optimization algorithm based on gradient descent and ascent (TaRo-BOBA) extracts a task-robust latent space that overcomes the distribution shift between the training and meta-testing data sets. Theoretical analysis shows that TaRo-BOBA converges to a first-order stationary point at a rate of $O(\sqrt{n}K^{-2/5})$, where $K$ denotes the total number of iterations and $n$ denotes the number of tasks. To the best of our knowledge, this is the first work that formulates task-robust meta-learning as a min-max bilevel optimization problem and provides a single-loop gradient-based algorithm with provable convergence rate guarantees.
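For concreteness, a generic min-max bilevel formulation of this kind can be sketched as follows; the notation ($f_i$, $g_i$, $\lambda$, $\Delta_n$) is assumed for illustration and the paper's exact problem setup may differ:
\[
\min_{x} \; \max_{\lambda \in \Delta_n} \; \sum_{i=1}^{n} \lambda_i \, f_i\big(x, y_i^*(x)\big)
\quad \text{s.t.} \quad y_i^*(x) \in \arg\min_{y_i} \, g_i(x, y_i), \qquad i = 1, \dots, n,
\]
where $\Delta_n$ is the probability simplex over the $n$ tasks, $f_i$ and $g_i$ denote the upper- and lower-level objectives of task $i$ (e.g., meta-test and task-adaptation losses), and the inner maximization over $\lambda$ enforces robustness to the worst-case task weighting.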