UMN Machine Learning Seminar: Attention is not all you need

The UMN Machine Learning Seminar Series brings together faculty, students, and local industry partners interested in the theoretical, computational, and applied aspects of machine learning to pose problems, exchange ideas, and foster collaborations. Talks are held every Wednesday from 12 p.m. to 1 p.m. during the Spring 2022 semester.

This week's speaker, Yihe Dong (Google), will be giving a talk titled "Attention is not all you need."

Abstract

I will talk about our recent work on better understanding attention. Attention-based architectures have become ubiquitous in machine learning, yet our understanding of the reasons for their effectiveness remains limited. We show that self-attention possesses a strong inductive bias towards "token uniformity". Specifically, without skip connections or multi-layer perceptrons (MLPs), the output converges doubly exponentially to a rank-1 matrix. On the other hand, skip connections and MLPs prevent the output from degenerating. Along the way, we develop a useful decomposition of attention architectures. This is joint work with Jean-Baptiste Cordonnier and Andreas Loukas.
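
For readers who want to probe the rank-collapse claim numerically, below is a minimal sketch (not the authors' code; the layer sizes, random weight scaling, and residual measure are illustrative assumptions). It stacks pure self-attention layers with no skip connections or MLPs and tracks the relative distance of the output to the nearest rank-1, token-uniform matrix, which should shrink rapidly with depth.

```python
# Minimal sketch: rank collapse of pure self-attention (no skips, no MLPs).
# All hyperparameters are illustrative assumptions, not the authors' setup.
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model, n_layers = 32, 64, 12

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def pure_attention_layer(X, Wq, Wk, Wv):
    # Single-head self-attention with no skip connection and no MLP.
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d_model)
    return softmax(scores, axis=-1) @ (X @ Wv)

def rank1_residual(X):
    # Relative Frobenius distance to the nearest matrix with all rows
    # identical ("token uniformity"), i.e. a rank-1 matrix 1 x^T.
    mean_row = X.mean(axis=0, keepdims=True)
    return np.linalg.norm(X - mean_row) / np.linalg.norm(X)

X = rng.standard_normal((n_tokens, d_model))
for layer in range(n_layers):
    Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
                  for _ in range(3))
    X = pure_attention_layer(X, Wq, Wk, Wv)
    print(f"layer {layer + 1:2d}: relative residual to rank-1 = {rank1_residual(X):.3e}")
```

The residual is measured relative to the output's norm, so the trend is invariant to the overall scale drift introduced by the random value projections.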

Biography

Yihe Dong is a machine learning researcher and engineer at Google, with interests in geometric deep learning and natural language processing.

Start date
Wednesday, Jan. 19, 2022, 12 p.m.
End date
Wednesday, Jan. 19, 2022, 1 p.m.