Cray Distinguished Speaker: OLMo, Tülu, and Friends: Accelerating the Science of Language Modeling
The computer science colloquium takes place on Mondays from 11:15 a.m. - 12:15 p.m. This week's speaker, Noah Smith (University of Washington), will be giving a talk titled, "OLMo, Tülu, and Friends: Accelerating the Science of Language Modeling".
Abstract
Neural language models with billions of parameters and trained on trillions of words are powering the fastest-growing computing applications in history and generating discussion and debate around the world. Yet most scientists cannot study or improve those state-of-the-art models because the organizations deploying them keep their data and machine learning processes secret. I believe that the path to models that are usable by all, at low cost, customizable for areas of critical need like the sciences, and whose capabilities and limitations are made transparent and understandable, is radically open development, with academic and not-for-profit researchers empowered to do reproducible science. In this talk, I'll share the story of the work our team is doing to radically open up the science of language modeling. This year, we've released multiple iterations of OLMo, a strong language model with fully open pretraining data, including a strong mixture-of-experts model, OLMoE. From these we also built Molmo, an open language-vision model. We've also built and released Tülu, a series of models that systematically explore the post-training landscape. All of these come with open-source code and extensive documentation, including new tools for evaluation. Together these artifacts make it possible to explore new scientific questions and democratize control of the future of this fascinating and important technology.
The work I’ll present was led by a large team at the Allen Institute for Artificial Intelligence in Seattle, with collaboration from the Paul G. Allen School at the University of Washington and various kinds of support and coordination from many organizations, including the Kempner Institute for the Study of Natural and Artificial Intelligence at Harvard University, AMD, CSC - IT Center for Science (Finland), Databricks, Together.ai, and the National AI Research Resource Pilot.
Biography
Noah Smith is a computer scientist working in several fields of artificial intelligence research. He recently wrote Language Models: A Guide for the Perplexed, a general-audience tutorial, and he co-directs the OLMo open language modeling effort with Hanna Hajishirzi.
Broadly, his research targets algorithms that process data encoding language, music, and more, to augment human capabilities. He also works on core problems of research methodology, such as evaluation. You can watch videos of some of his talks, read his papers, and learn about his research groups, Noah's ARK and AllenNLP. Smith is most proud of his mentoring accomplishments: as of 2024, he has graduated 29 Ph.D. students and mentored 15 postdocs, with 27 alumni now in faculty positions around the world. Twenty of his undergraduate and master's mentees have gone on to Ph.D. programs. His group's alumni have started companies and are technological leaders both inside and outside the tech industry.