Minnesota Natural Language Processing Seminar Series: Yoon Kim

Efficient Transfer Learning with Large Language Models

The Minnesota Natural Language Processing (NLP) Seminar is a venue for faculty, postdocs, students, and anyone else interested in theoretical, computational, and human-centric aspects of natural language processing to exchange ideas and foster collaboration. Talks are held every other Friday from 12 to 1 p.m. during the Spring 2022 semester.

This week's speaker, Yoon Kim (MIT), will be giving a talk titled "Efficient Transfer Learning with Large Language Models."


Transfer learning with large pretrained language models is the dominant paradigm in natural language processing. With moderately sized models (e.g., BERT), transfer learning involves full finetuning, which produces a separate task-specific set of parameters for each task and makes the approach hard to scale to storage-constrained scenarios. With larger models (e.g., GPT-3), the model is adapted to each task via natural language prompts, so the pretrained parameters remain fixed. However, few-shot learning via prompting emerges only when model sizes are sufficiently large, so inference remains expensive. This talk explores two approaches for improving the memory and inference efficiency of large language models within the transfer learning paradigm. For finetuned models, we show that updating only a small subset of the model parameters (0.5%) is enough to match the performance of fully finetuned models. For prompted models, we show that co-training (wherein two models are trained on confidently labeled outputs from each other) can produce much smaller models that outperform the original prompted model.
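The finetuning result above rests on a simple idea: keep most of the pretrained parameters frozen and update only a small subset. As a minimal, hypothetical sketch of that idea (the subset here is the bias vectors, a common illustrative choice; the talk's specific parameter-selection method may differ), consider a toy "model" stored as named parameter lists:

```python
# Hedged sketch: parameter-efficient finetuning by updating only a small,
# named subset of parameters (here, bias vectors) while freezing the rest.
# This is an illustrative toy, not the method presented in the talk.

def finetune_step(params, grads, trainable, lr=0.1):
    """Apply a gradient step only to parameters marked as trainable."""
    return {
        name: [w - lr * g for w, g in zip(weights, grads[name])]
              if name in trainable else weights
        for name, weights in params.items()
    }

# Toy "model": large weight matrices plus tiny bias vectors.
params = {
    "layer1.weight": [0.5] * 1000,
    "layer1.bias":   [0.1] * 4,
    "layer2.weight": [0.3] * 1000,
    "layer2.bias":   [0.2] * 4,
}
grads = {name: [1.0] * len(v) for name, v in params.items()}

# Update only the biases -- a small fraction of all parameters,
# in the spirit of the ~0.5% figure from the abstract.
trainable = {"layer1.bias", "layer2.bias"}
updated = finetune_step(params, grads, trainable)

n_trainable = sum(len(params[n]) for n in trainable)
n_total = sum(len(v) for v in params.values())
print(f"trainable fraction: {n_trainable / n_total:.2%}")
```

Only the task-specific subset (8 of 2,008 values here) needs to be stored per task; the frozen parameters are shared across all tasks, which is what makes the approach attractive in storage-constrained settings.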


Yoon Kim is an assistant professor at MIT in the Department of Electrical Engineering and Computer Science. He obtained his PhD from Harvard University, where he was advised by Alexander Rush.

Start date
Friday, May 6, 2022, Noon
End date
Friday, May 6, 2022, 1 p.m.