UMN Machine Learning Seminar

The UMN Machine Learning Seminar Series brings together faculty, students, and local industry partners interested in the theoretical, computational, and applied aspects of machine learning to pose problems, exchange ideas, and foster collaborations. Talks are held every Thursday from 12 p.m. to 1 p.m. during the Fall 2021 semester.

This week's speaker is Tuo Zhao (Georgia Tech).

Abstract

Transfer learning has fundamentally changed the landscape of natural language processing (NLP). Many state-of-the-art models are first pre-trained on a large text corpus and then fine-tuned on downstream tasks. However, when only limited supervision is available for the downstream tasks, the extremely high complexity of pre-trained models means that aggressive fine-tuning often causes the fine-tuned model to overfit the downstream training data and fail to generalize to unseen data.

To address this concern, we propose a new approach for fine-tuning pre-trained models that attains better generalization performance. The approach combines three ingredients: (1) smoothness-inducing adversarial regularization, which effectively controls the complexity of the massive model; (2) Bregman proximal point optimization, which is an instance of trust-region algorithms and prevents aggressive updating; and (3) differentiable programming, which mitigates the undesired bias induced by conventional adversarial training algorithms. Our experiments show that the proposed approach significantly outperforms existing methods on multiple NLP tasks. In addition, our theoretical analysis provides new insights into how adversarial training improves generalization.

Biography

Tuo Zhao is an assistant professor at Georgia Tech. He received his Ph.D. in Computer Science from Johns Hopkins University. His research focuses mainly on developing methodologies, algorithms, and theories for machine learning, especially deep learning. He also works actively on neural language models and open-source machine learning software for scientific data analysis. He has received several awards, including winner of the INDI ADHD-200 global competition, the ASA best student paper award on statistical computing, the INFORMS best paper award on data mining, and a Google faculty research award.

Start date
Thursday, Sept. 9, 2021, 2:30 p.m.
End date
Thursday, Sept. 9, 2021, 3:30 p.m.
Location
