UMN Machine Learning Seminar
The UMN Machine Learning Seminar Series brings together faculty, students, and local industrial partners who are interested in the theoretical, computational, and applied aspects of machine learning, to pose problems, exchange ideas, and foster collaborations. The talks are every Thursday from 12 p.m. - 1 p.m. during the Summer 2021 semester.
This week's speaker, Weijie Su (Wharton Statistics Department, University of Pennsylvania) will be giving a talk titled "Local Elasticity: A Phenomenological Approach Toward Understanding Deep Learning."
Weijie Su is an Assistant Professor in the Wharton Statistics Department and in the Department of Computer and Information Science, at the University of Pennsylvania. He is a co-director of Penn Research in Machine Learning. Prior to joining Penn, he received his Ph.D. from Stanford University in 2016 and his bachelor’s degree from Peking University in 2011. His research interests span machine learning, optimization, privacy-preserving data analysis, and high-dimensional statistics. He is a recipient of the Stanford Theodore Anderson Dissertation Award in 2016, an NSF CAREER Award in 2019, and an Alfred Sloan Research Fellowship in 2020.
Motivated by the iterative nature of training neural networks, we ask: If the weights of a neural network are updated using the induced gradient on an image of a tiger, how does this update impact the prediction of the neural network at another image (say, an image of another tiger, a cat, or a plane)? To address this question, I will introduce a phenomenon termed local elasticity. Roughly speaking, our experiments show that modern deep neural networks are locally elastic in the sense that the change in prediction is likely to be most significant at another tiger and least significant at a plane, at late stages of the training process. I will illustrate some implications of local elasticity by relating it to the neural tangent kernel and improving on the generalization bound for uniform stability. Moreover, I will introduce a phenomenological model for simulating neural networks, which suggests that local elasticity may result from feature sharing between semantically related images and the hierarchical representations of high-level features. Finally, I will offer a local-elasticity-focused agenda for future research toward a theoretical foundation for deep learning.