CSE DSI Machine Learning Seminar with Benjamin Grimmer (Johns Hopkins)
Accelerated Gradient Descent via Long Steps
This talk will discuss recent work establishing provably faster convergence rates for gradient descent in smooth convex optimization via semidefinite programming and computer-assisted analysis techniques. We do this by allowing nonconstant stepsize policies with frequent long steps that may violate descent. This is managed by analyzing the overall effect of many iterations at once rather than the typical one-iteration inductions used in most first-order method analyses. We prove an O(1/T^{1.02449}) convergence rate, beating the classic O(1/T) rate simply by periodically including longer steps (no momentum needed!).
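To give a feel for the kind of stepsize policy involved, below is a minimal Python sketch of gradient descent whose stepsizes cycle through a short pattern containing an occasional long step. The pattern (1, 1, 3) and the least-squares test problem are purely illustrative assumptions, not the certified stepsize schedules from the talk.

```python
import numpy as np

def gradient_descent_long_steps(grad, x0, L, pattern=(1.0, 1.0, 3.0), iters=300):
    """Gradient descent with a cyclic, nonconstant stepsize policy.

    Stepsizes are h_k / L with h_k cycling through `pattern`; steps with
    h_k > 2 are "long" and may temporarily increase the objective.
    The pattern here is illustrative only, not the schedule from the talk.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(iters):
        h = pattern[k % len(pattern)]   # nonconstant stepsize multiplier
        x = x - (h / L) * grad(x)       # occasional long step, no momentum
    return x

# Illustrative test problem: least squares f(x) = 0.5 * ||A x - b||^2,
# whose gradient is A^T (A x - b) and smoothness constant L = lambda_max(A^T A).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
L = np.linalg.eigvalsh(A.T @ A).max()
x_hat = gradient_descent_long_steps(lambda x: A.T @ (A @ x - b), np.zeros(5), L)
print(np.linalg.norm(A.T @ (A @ x_hat - b)))  # gradient norm at the final iterate
```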
Ben Grimmer is an assistant professor at Johns Hopkins in Applied Mathematics and Statistics. He completed his PhD at Cornell, mentored by Jim Renegar and Damek Davis, funded by an NSF fellowship, with brief stints at the Simons Institute and Google Research. Ben's work revolves around building meaningful foundational theory for first-order optimization methods. His research interests span from tackling challenges in nonsmooth/nonconvex/non-Lipschitz optimization to developing novel (accelerated) projection-free "radial" methods based on dual gauge reformulations. This talk will focus solely on developments in the foundations of classic gradient descent for smooth convex minimization.