In-context learning of solution operators to linear elliptic PDEs
Data Science Seminar
Frank Cole
University of Minnesota
Abstract
Transformer-based neural networks have demonstrated the ability to learn in-context: given a few demonstrations of a new task, they can make correct predictions without updating their parameters. Recently, transformer-based foundation models have emerged as a powerful tool for solving complex scientific problems such as partial differential equations. However, their theoretical underpinnings are largely underdeveloped. In this talk, we present a framework for solving linear elliptic PDEs in-context with transformer neural networks, which consists of i) discretizing the infinite-dimensional PDE problem to a finite-dimensional linear system, and ii) learning the linear system in-context with transformers. We derive generalization bounds for the PDE recovery error with respect to the number of pre-training tasks, the prompt lengths during training and inference, and the size of the discretization. We also study the behavior of pre-trained transformers under shifts in the distribution of tasks. Specifically, we introduce a novel definition of task diversity and prove that it is a sufficient condition for pre-trained transformers to generalize under task distribution shifts. In addition, we provide several sufficient conditions for task diversity to hold. Numerical experiments validate our theoretical findings.
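
The following is a minimal sketch (not the speaker's code) of the two-step setup described in the abstract, assuming a 1D Poisson problem -u'' = f on [0, 1] with zero Dirichlet boundary conditions. The function names and the least-squares "in-context" baseline are illustrative placeholders standing in for the transformer studied in the talk.

import numpy as np

def discretize_poisson_1d(n):
    """Finite-difference matrix for -u'' on a uniform grid with n interior points."""
    h = 1.0 / (n + 1)
    A = (np.diag(2.0 * np.ones(n))
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2
    return A

def make_prompt(A, num_demos, rng):
    """Sample demonstration pairs (f_i, u_i), where u_i solves the discrete system A u_i = f_i."""
    n = A.shape[0]
    F = rng.standard_normal((num_demos, n))   # source terms, one per row
    U = np.linalg.solve(A, F.T).T             # corresponding discrete solutions
    return F, U

rng = np.random.default_rng(0)
A = discretize_poisson_1d(n=64)               # step i): discretize the PDE
F, U = make_prompt(A, num_demos=100, rng=rng)

# Stand-in for the in-context learner (step ii)): fit a linear map from f to u
# using only the demonstrations in the prompt, then predict the solution for a
# new query source term without further access to A.
W, *_ = np.linalg.lstsq(F, U, rcond=None)
f_query = rng.standard_normal(64)
u_pred = f_query @ W
u_true = np.linalg.solve(A, f_query)
print("relative error:", np.linalg.norm(u_pred - u_true) / np.linalg.norm(u_true))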