CS&E Colloquium: Towards Planning in Creative Contexts

The computer science colloquium takes place on Mondays from 11:15 a.m. to 12:15 p.m. This week's speaker, Alexander Spangher (University of Southern California), will give a talk titled "Towards Planning in Creative Contexts".

Abstract

Recent modeling innovations that teach large language models (LLMs) how to plan (that is, to break complex problems into multiple steps and solve them) have allowed LLMs to achieve impressive results in domains like mathematical problem-solving and coding. However, tasks in such domains are often characterized by large training datasets and well-defined rewards. Many human-centered tasks, especially creative tasks, occur in contexts where goals and rewards are not as clearly defined and datasets are limited; thus, we lack the means necessary to train models to plan in such settings. In this talk, I will outline a research agenda that can enable us to make progress. I will present three pillars: (1) observing plans: how long-range text modeling can help us make inferences about past human actions based on state-observations (a process known to cognitive psychologists as "emulation, based on end-state observation"); (2) improving plans: how these inferences can help us benchmark LLMs in creative tasks and how hierarchical modeling can help us learn novel planning strategies; and (3) executing plans: how classifier-free guidance, an inference-time technique, can be used to help LLMs adhere to complex plans. I will demonstrate these processes in the domain of journalism, with a specific focus on the task of helping journalists find sources to support their writing processes.

Biography

Alexander Spangher is a PhD candidate at the University of Southern California advised by Jonathan May and Emilio Ferrara. His research focuses on modeling human decision-making in creative domains, especially in contexts where data is limited and rewards and goals are less clear. He is building out a new domain of learning, called emulation learning, with the goal of training the next generation of reasoning-oriented language models to be more proficient in these domains. His research has been used at technology organizations like OpenAI, Google and EleutherAI. He is also especially passionate about helping journalists and has framed tasks and trained reasoning LLMs to help journalists find stories and sources, structure narratives and track information updates. These tools have been incorporated into newsrooms at the New York Times, Bloomberg and Stanford Big Local News, impacting thousands of journalists, and his work is also informing the next generation of journalistic education at USC Annenberg. His work has received numerous awards, including two outstanding paper awards at EMNLP 2024, one spotlight award at ICML 2024, one outstanding paper award at NAACL 2022 and a best paper award at CJ2023, and he has been supported by a 4-year Bloomberg PhD Fellowship. His work is broad: in addition to his work in NLP and computational journalism, he has studied misinformation at Microsoft Research and collaborated with the MIT Plasma Science and Fusion Center to model plasma fusion processes.
