CSE DSI Machine Learning Seminar with Stephan Rabanser (Princeton)

Towards a Science of AI Agent Reliability

AI agents are increasingly performing consequential tasks autonomously: writing code, making purchases, and providing advice. But how do we know when to trust them? Current evaluation focuses predominantly on success rates: how often does the agent complete the task? This misses critical questions about how agents behave: Do they give the same answer twice? Do they fail gracefully when conditions change? Can they tell us when they’re likely to be wrong? Drawing on decades of practice from aviation, nuclear power, and other safety-critical domains, we propose a framework that decomposes reliability into four dimensions: consistency, robustness, predictability, and safety. Evaluating 12 frontier AI models, we find a striking result: despite rapid capability improvements over 18 months, reliability has barely budged. Agents that are substantially more accurate remain inconsistent across runs and poorly calibrated about their own uncertainty. The implication is clear: building capable AI is not the same as building dependable AI. As agents take on higher-stakes tasks, we need evaluation practices that ask not just “does it work?” but “can we count on it?”

Stephan Rabanser is a Postdoctoral Research Associate at Princeton University's Center for Information Technology Policy (CITP), where he works with Professors Arvind Narayanan and Matthew Salganik. His research focuses on trustworthy machine learning, particularly uncertainty quantification, selective prediction, and out-of-distribution robustness. His current emphasis is on designing uncertainty mechanisms for large generative models to support safer deployment and more reliable decision-making.

Stephan holds a Ph.D. in Computer Science from the University of Toronto, where he was affiliated with the Vector Institute and advised by Nicolas Papernot. He also holds an M.Sc. and B.Sc. in Informatics from the Technical University of Munich (TUM) and an Honours Degree in Technology Management from CDTM. He has held research and engineering positions at Amazon/AWS AI Labs and Google, and has been a research visitor at MIT, Carnegie Mellon, and the University of Cambridge.
