ISyE Seminar Series: Martin Zubeldia
"Concentration of contractive Stochastic Approximation and applications to Reinforcement Learning"
Presentation by Martin Zubeldia
Assistant Professor, Department of Industrial and Systems Engineering
University of Minnesota
3:30 pm - Seminar
4:30 pm - Reception, cookies and coffee
About the seminar:
We study the concentration behavior of a stochastic approximation algorithm under a contractive operator with respect to an arbitrary norm. We consider two settings where the iterates are potentially unbounded: (1) bounded multiplicative noise, and (2) additive sub-Gaussian noise. We obtain concentration bounds on the convergence errors, and show that these errors have sub-Gaussian tails in the additive noise setting, and super-polynomial tails (faster than polynomial decay) in the multiplicative noise setting. Moreover, our bounds hold anytime in the sense that the entire sample path lies within a tube of decaying radius with high probability.
To demonstrate the applicability of our theoretical results, we use them to provide anytime high probability bounds for a large class of reinforcement learning algorithms, including but not limited to on-policy TD-learning with linear function approximation, off-policy TD-learning with generalized importance sampling factors, and Q-learning. To the best of our knowledge, super-polynomial concentration bounds for off-policy TD-learning have not been established in the literature due to the challenge of handling the combination of unbounded iterates and multiplicative noise.
Zubeldia is an Assistant Professor in the department of Industrial and Systems Engineering (ISyE) at the University of Minnesota. Before joining the University of Minnesota, he spent a year as a Postdoctoral Fellow in the department of Industrial and Systems Engineering at the Georgia Institute of Technology, and two years as a Postdoc in the department of Mathematics and Computer Science at the Eindhoven University of Technology and in the Korteweg-de Vries Institute for Mathematics at the University of Amsterdam. He received a PhD in Electrical Engineering from the Massachusetts Institute of Technology (2019), and a M.Sc. in Engineering (2014) and a B.Sc. in Electronics Engineering (2012) from Universidad ORT Uruguay.
His research primarily focuses on using applied probability for the modeling, analysis, and control of large-scale stochastic decision and learning systems. He is particularly interested in exploring the fundamental tradeoffs between performance, efficiency, and scalability that arise in these systems, and in how traditional model-based analysis can be combined with newer data-centric approaches to get the best of both worlds. His research has been recognized as a finalist in the 2019 INFORMS APS Best Student Paper Award, and as a finalist in the 2016 INFORMS George E. Nicholson Student Paper Competition.