Learning in the presence of low-dimensional structure: a spiked random matrix perspective
Data Science Seminar
Denny Wu (New York University)
Abstract
Real-world learning problems are often high-dimensional, yet they frequently exhibit low-dimensional structure. We study the performance of (i) kernel methods and (ii) neural networks optimized via gradient descent, when this low-dimensionality is encoded in two ways: (1) the target function is a single-index model, i.e., an unknown link function applied to a one-dimensional projection of the input; (2) the input features are drawn from a spiked covariance model, which describes a low-dimensional signal (spike) "hidden" in high-dimensional noise (bulk). We characterize the interplay between structured data (the extent of input anisotropy, as well as the overlap between the input spike and the target direction) and the sample complexity of the learning algorithms. We show that both kernel ridge regression and neural networks benefit from low-dimensional structure, but neural networks adapt to such structure more effectively due to feature learning.
Based on joint works with Jimmy Ba, Murat A. Erdogdu, Alireza Mousavi-Hosseini, Taiji Suzuki, and Zhichao Wang.
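To make the two notions of low-dimensional structure in the abstract concrete, here is a minimal, self-contained sketch (not taken from the talk) that generates inputs from a spiked covariance model, labels them with a single-index target whose direction overlaps the spike, and compares kernel ridge regression against a small gradient-trained neural network. All dimensions, the choice of link function, and the hyperparameters are hypothetical illustrations, not the settings analyzed in the works above.

```python
# Hypothetical illustration of the two structures in the abstract:
# spiked covariance inputs + single-index target, then KRR vs. a small NN.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
d, n_train, n_test = 100, 500, 1000
theta = 10.0  # spike strength, controlling input anisotropy (assumed value)

# Spiked covariance: Sigma = I_d + theta * v v^T, with unit spike direction v.
v = rng.standard_normal(d)
v /= np.linalg.norm(v)

def sample_inputs(n):
    # x = z + sqrt(theta) * g * v, with z ~ N(0, I_d) and g ~ N(0, 1),
    # so that E[x x^T] = I_d + theta * v v^T.
    z = rng.standard_normal((n, d))
    g = rng.standard_normal((n, 1))
    return z + np.sqrt(theta) * g * v

# Single-index target y = sigma_*(<w, x>); the link below is a hypothetical
# choice (a cubic Hermite polynomial). Taking w = v gives full overlap
# between the target direction and the input spike.
w = v
link = lambda u: u**3 - 3 * u

X_train, X_test = sample_inputs(n_train), sample_inputs(n_test)
y_train, y_test = link(X_train @ w), link(X_test @ w)

# Kernel ridge regression with a fixed RBF kernel (rotation-invariant,
# so it cannot adapt its features to the direction w).
krr = KernelRidge(kernel="rbf", alpha=1e-3, gamma=1.0 / d)
krr.fit(X_train, y_train)

# A small two-layer network trained by gradient-based optimization,
# whose first-layer weights can align with w (feature learning).
nn = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
nn.fit(X_train, y_train)

def test_mse(model):
    return np.mean((model.predict(X_test) - y_test) ** 2)

print(f"KRR test MSE: {test_mse(krr):.3f}")
print(f"NN  test MSE: {test_mse(nn):.3f}")
```

Varying theta (the anisotropy) and the overlap between w and v in this sketch gives a rough empirical analogue of the interplay the abstract describes; the precise sample-complexity characterizations are the subject of the talk itself.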