CSE DSI Machine Learning Seminar with Yu Bai (Salesforce)
Understanding Learning in the Age of Foundation Models
In classical supervised learning, models such as neural networks are primarily used as prediction functions that map a single input to a prediction. This paradigm, while standard, does not fully explain the impressive capabilities of modern AI systems such as transformer-based large language models. This talk offers a refined perspective: transformers can efficiently implement learning algorithms that operate on a sequence of inputs. In the setting of in-context learning, we show how this perspective allows us to understand the remarkable ability of transformers to perform various learning tasks in context, such as (in-context) supervised learning, algorithm selection, and reinforcement learning, and to make theoretical predictions about internal mechanisms that align with transformers trained in practice.
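To make the in-context learning setting concrete, the following is a minimal illustrative sketch (not taken from the talk): the model receives a sequence of labeled examples (x_1, y_1), ..., (x_n, y_n) together with a query x in a single forward pass and must predict the query's label without any weight updates. The least-squares solver below stands in for the kind of learning algorithm a trained transformer can implement internally over its input sequence; the specific dimensions and noise level are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 5, 40                      # input dimension, number of in-context examples
w_star = rng.normal(size=d)       # task-specific ground-truth weights

# In-context examples and a held-out query, all presented as one "prompt".
X = rng.normal(size=(n, d))
y = X @ w_star + 0.1 * rng.normal(size=n)
x_query = rng.normal(size=d)

# A least-squares "learner" run on the prompt: an example of an algorithm
# operating on the input sequence rather than stored in model weights.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("in-context prediction:", x_query @ w_hat)
print("ground-truth value:   ", x_query @ w_star)
```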
Yu Bai is currently a Senior Research Scientist at Salesforce AI Research. Yu’s research interests lie broadly in machine learning, spanning large language models/foundation models, reinforcement learning, game theory, deep learning theory, and uncertainty quantification. Prior to joining Salesforce, Yu completed his PhD at Stanford University.