CSE DSI Machine Learning Seminar with Jason Lee (Princeton University)

Canceled due to illness

Learning Representations and Associations with Gradient Descent

Machine learning has undergone a paradigm shift with the success of pretrained models. Pretraining via gradient descent learns transferable representations that adapt to a wide swath of downstream tasks. However, a significant body of prior theoretical work has demonstrated that in many regimes, overparametrized neural networks trained by gradient descent behave like kernel methods and do not learn transferable representations. In this talk, we close this gap by demonstrating that there is a large class of functions that cannot be efficiently learned by kernel methods but can be easily learned by gradient descent on a neural network, which learns representations relevant to the target task. We also demonstrate that these representations enable efficient transfer learning, which is impossible in the kernel regime.

Finally, we will demonstrate how pretraining learns associations for in-context learning with transformers. This leads to a systematic, mechanistic understanding of how transformers learn causal structures, including the celebrated induction head identified by Anthropic.

Jason Lee is an associate professor in Electrical Engineering, with a secondary appointment in Computer Science, at Princeton University. Prior to that, he was in the Data Science and Operations department at the University of Southern California and a postdoctoral researcher at UC Berkeley working with Michael I. Jordan. He received his PhD from Stanford University, advised by Trevor Hastie and Jonathan Taylor. His research interests are in the theory of machine learning, optimization, and statistics. Lately, he has worked on the foundations of deep learning, representation learning, and reinforcement learning. He has received the Samsung AI Researcher of the Year Award, the NSF CAREER Award, the ONR Young Investigator Award in Mathematical Data Science, a Sloan Research Fellowship, and the NeurIPS Best Student Paper Award, and he was a finalist for the Best Paper Prize for Young Researchers in Continuous Optimization.

Start date
Tuesday, Oct. 15, 2024, 11 a.m.
End date
Tuesday, Oct. 15, 2024, noon
Location

Virtual seminar via Zoom; can be viewed in Keller Hall 3-180
