Alternating Gradient Descent Ascent for Nonconvex Min-Max Problems in Robust Learning and GANs
Abstract
We study a class of nonconvex-strongly-concave min-max optimization problems. Among the most commonly used algorithms for such problems in machine learning applications are first-order methods in which gradient descent and ascent steps are performed simultaneously or alternately at each iteration. Despite their great success in practice, the theoretical properties of these methods are far from understood. In fact, little is known about their convergence once the convex-concave assumption is absent. This stands in sharp contrast to minimization problems, where many techniques are available for analyzing nonconvex objectives; it is not clear whether these techniques can be applied to min-max optimization. Despite the simplicity of this type of first-order method, its behavior is extremely difficult to analyze due to the nonlinear and nonconvex coupling between the maximization and minimization steps.

In this paper, we take a step in this direction by examining a special class of nonconvex-strongly-concave min-max problems. We show that, with a proper stepsize choice, a simple alternating gradient descent/ascent (AGDA) algorithm converges to a stationary solution at a sublinear rate $\mathcal{O}(1/t)$, where $t$ is the iteration number. We hope our analysis sheds light on future studies of the theoretical properties of related machine learning problems.
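To make the alternating scheme concrete, the following is a minimal sketch of AGDA on a toy objective. The objective, stepsizes, and starting point are illustrative assumptions, not the problem class or parameter choices analyzed in the paper; the only structural point carried over is that the ascent step uses the freshly updated iterate, which is what distinguishes alternating from simultaneous GDA.

```python
import math

# Toy objective (an illustrative assumption):
#   f(x, y) = cos(x) + 2*x*y - y**2
# It is nonconvex in x (the second derivative in x is -cos(x), which
# changes sign) and strongly concave in y.

def grad_x(x, y):
    # Partial derivative of f with respect to x.
    return -math.sin(x) + 2.0 * y

def grad_y(x, y):
    # Partial derivative of f with respect to y.
    return 2.0 * x - 2.0 * y

def agda(x, y, lr_x=0.05, lr_y=0.4, iters=2000):
    """Alternating gradient descent/ascent: the ascent step on y
    uses the freshly updated x, rather than the previous one."""
    for _ in range(iters):
        x = x - lr_x * grad_x(x, y)   # descent step on x
        y = y + lr_y * grad_y(x, y)   # ascent step on y, with new x
    return x, y

x, y = agda(1.0, 0.0)
print(x, y)
```

For this toy problem the unique stationary point is (0, 0): the inner maximization forces y = x, and then 2x = sin(x) has x = 0 as its only root, so the iterates above approach the origin.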