Robotics 8970 Colloquium: Andrew Lamperski (Fall 2021)

Non-convex learning, system identification, and stabilization for model-free reinforcement learning

In this talk, Dr. Lamperski will first examine the convergence of Langevin algorithms for machine learning and system identification problems with constraints.

Much of machine learning fits model parameters to data through optimization, typically using some variation of stochastic gradient descent. However, in many cases, such as neural network regression, the loss functions are non-convex, and stochastic gradient descent can get stuck in local minima, if it converges at all. Langevin methods augment standard gradient-based methods with additive noise. For unconstrained problems, it is well understood how the additive noise helps the algorithm escape undesirable minima. However, many neural network regression and probabilistic estimation problems require constraints. We describe a Langevin method for problems with non-convex losses and convex constraints. We will show how the method provably escapes local minima and converges to the global optimum, albeit slowly in the non-convex case. Then, we will show how the method can be applied to problems with correlated data, which arise when identifying the parameters of dynamic systems.
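As a rough illustration of the idea, the sketch below runs a projected Langevin iteration on a one-dimensional non-convex loss with an interval constraint. It is not the algorithm from the talk: the loss, the interval, and the names `loss`, `grad`, `project`, `projected_langevin`, `step`, and `beta` are all assumptions chosen for illustration. Each step takes a gradient step, adds Gaussian noise scaled by `sqrt(2 * step / beta)`, and projects back onto the convex constraint set.

```python
import numpy as np

def loss(x):
    # Hypothetical tilted double well: local minimum near x = +1,
    # global minimum near x = -1.
    return (x**2 - 1.0)**2 + 0.3 * x

def grad(x):
    # Derivative of the loss above.
    return 4.0 * x * (x**2 - 1.0) + 0.3

def project(x, lo=-2.0, hi=2.0):
    # Euclidean projection onto the convex set [lo, hi].
    return np.clip(x, lo, hi)

def projected_langevin(x0, step=0.01, beta=3.0, n_steps=20000, seed=0):
    # beta acts as an inverse temperature: larger beta means less noise,
    # and beta = inf recovers plain projected gradient descent.
    rng = np.random.default_rng(seed)
    x = float(x0)
    trajectory = np.empty(n_steps)
    for k in range(n_steps):
        noise = rng.standard_normal()
        x = x - step * grad(x) + np.sqrt(2.0 * step / beta) * noise
        x = float(project(x))
        trajectory[k] = x
    return trajectory
```

Calling `projected_langevin(1.0)` starts the iterate at the local minimum; with noise it can cross the barrier between the wells, whereas `projected_langevin(1.0, beta=float("inf"))` stays near the local minimum. How quickly the noisy iterate crosses depends on `beta` and the barrier height, which is one way to see why convergence can be slow in the non-convex case.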

Second, Dr. Lamperski will describe the problem of model-free learning of stabilizing controllers for linear systems. In recent years, there has been a strong push to understand the theoretical properties of reinforcement learning on simple benchmark optimal control problems. The simplest optimal control problem with continuous state and action spaces is the linear quadratic regulator. All previous model-free approaches to this problem required knowledge of a stabilizing controller. However, computing this stabilizing controller is typically the most important part of the design process. We will describe an algorithm based on Q-learning that can find a stabilizing controller and then optimize it. It can be applied online to a single trajectory or offline to a fixed dataset.
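For readers unfamiliar with the linear quadratic regulator, the sketch below shows the structure that Q-learning approaches exploit: the Q-function is a quadratic form in the stacked state and input, and the greedy controller is a linear gain read off from its blocks. This is not the algorithm from the talk. It builds the Q-function matrix `H` from known system matrices `A` and `B`, whereas a model-free method would estimate `H` from data. All function names here are hypothetical.

```python
import numpy as np

def q_matrix(A, B, Q, R, P):
    # Quadratic Q-function matrix H for cost-to-go matrix P:
    # Q(x, u) = [x; u]^T H [x; u].
    return np.block([
        [Q + A.T @ P @ A, A.T @ P @ B],
        [B.T @ P @ A, R + B.T @ P @ B],
    ])

def greedy_gain(H, n):
    # Gain K such that u = K x minimizes [x; u]^T H [x; u] over u.
    H_ux = H[n:, :n]
    H_uu = H[n:, n:]
    return -np.linalg.solve(H_uu, H_ux)

def q_value_iteration(A, B, Q, R, n_iters=500):
    # Value iteration on the Q-function, started from zero cost-to-go,
    # so no initial stabilizing controller is supplied.
    n = A.shape[0]
    P = np.zeros((n, n))
    for _ in range(n_iters):
        H = q_matrix(A, B, Q, R, P)
        K = greedy_gain(H, n)
        # Minimizing over u gives the next cost-to-go matrix.
        P = H[:n, :n] + H[:n, n:] @ K
    return H, K

def spectral_radius(M):
    # Largest eigenvalue magnitude; below 1 means a stable discrete-time system.
    return float(np.max(np.abs(np.linalg.eigvals(M))))
```

For example, with an assumed unstable system `A = np.array([[1.2, 1.0], [0.0, 1.0]])`, `B = np.array([[0.0], [1.0]])`, and identity cost matrices, `q_value_iteration(A, B, np.eye(2), np.eye(1))` returns a gain `K` for which `spectral_radius(A + B @ K)` can be compared against `spectral_radius(A)` to check whether the closed loop is stable.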

About Dr. Andrew Lamperski

Dr. Andrew Lamperski received a B.S. in Biomedical Engineering and Mathematics from Johns Hopkins University in 2004 and a Ph.D. in Control and Dynamical Systems from the California Institute of Technology in 2011.

He held postdoctoral positions in control and dynamical systems at the California Institute of Technology from 2011 to 2012 and in mechanical engineering at Johns Hopkins University in 2012. From 2012 to 2014, Lamperski did postdoctoral work in the Department of Engineering, University of Cambridge, on a scholarship from the Whitaker International Program. In 2014, he joined the Department of Electrical and Computer Engineering, University of Minnesota, as an Assistant Professor.

His research interests include optimal control and machine learning, with applications to neuroscience and robotics.
