Colloquium: Reinforcement Learning for Complex Environments: Tree Search, Function Approximators and Markov Games

The computer science colloquium takes place on Mondays and Fridays from 11:15 a.m. to 12:15 p.m.

This week's speaker, Qiaomin Xie (Cornell University), will give a talk titled "Reinforcement Learning for Complex Environments: Tree Search, Function Approximators and Markov Games."


Recent literature has witnessed much progress on the algorithmic and theoretical foundations of Reinforcement Learning (RL), particularly for single-agent problems with small state/action spaces. Our understanding and algorithmic toolbox for RL in complex environments, however, remain relatively limited. In this talk, I will discuss some of my work on scalable and provably efficient RL for challenging settings with large state/action spaces and multiple strategic agents.

First, I will focus on simulation-based methods, as exemplified by Monte-Carlo Tree Search (MCTS). MCTS is a powerful paradigm for online planning that enjoys remarkable empirical success but lacks theoretical understanding. We provide a complete and rigorous non-asymptotic analysis of MCTS. Our analysis develops a general framework based on a hierarchy of bandits, and highlights the importance of using a non-standard confidence bound (also used by AlphaGo) for convergence. I will further discuss combining MCTS with supervised learning and its generalization to continuous action spaces.

In the second part of the talk, I will discuss on-policy RL for zero-sum Markov games, which generalize Markov decision processes to multi-agent settings. We consider function approximation to deal with continuous and unbounded state spaces. Based on a fruitful marriage with algorithmic game theory, we develop the first computationally efficient algorithm for this setting, with a provable regret bound that is independent of the cardinality and ambient dimension of the state space.


Qiaomin Xie is a visiting assistant professor in the School of Operations Research and Information Engineering (ORIE) at Cornell. Prior to that, she was a postdoctoral researcher with LIDS at MIT, and was a research fellow at the Simons Institute during Fall 2016. Qiaomin received her Ph.D. degree in Electrical and Computer Engineering from the University of Illinois Urbana-Champaign, and her B.E. degree in Electronic Engineering from Tsinghua University. Her research interests lie in the fields of stochastic networks, reinforcement learning, and computer and network systems. She is the recipient of the Google System Research Award 2020, the UIUC CSL PhD Thesis Award 2017, and the best paper award from the IFIP Performance Conference 2011.

Start date
Friday, April 2, 2021, 11:15 a.m.
End date
Friday, April 2, 2021, 12:15 p.m.

Online - Zoom link
