Cray Distinguished Speaker: ML-Powered Diagnosis of Performance Anomalies in Computer Systems

The computer science colloquium takes place on Mondays from 11:15 a.m. to 12:15 p.m. This week's speaker, Ayse K. Coskun (Boston University), will give a talk titled "ML-Powered Diagnosis of Performance Anomalies in Computer Systems."

Abstract

Today’s large-scale computer systems that serve high performance computing and cloud computing face challenges in delivering predictable performance while maintaining efficiency, resilience, and security. Much of computer system management has traditionally relied on manual expert analysis and on policies built from heuristics derived from that analysis. This talk will discuss a new path toward designing ML-powered “automated analytics” methods for large-scale computer systems and how to make strides towards a longer-term vision in which computing systems are able to manage and improve themselves. Specifically, the talk will first cover how to systematically diagnose the root causes of performance “anomalies”, which cause substantial efficiency losses and higher costs. Second, it will discuss how to identify applications running on computing systems and how such discoveries can help reduce vulnerabilities and avoid unwanted applications. The talk will also highlight how to apply ML in a practical and scalable way to help understand complex systems, demonstrate methods to help standardize the study of performance anomalies, discuss the explainability of applied ML methods in the context of computer systems, and point out future directions in automating computer system management.

Biography

Prof. Ayse K. Coskun is a full professor in the Electrical and Computer Engineering Department at Boston University (BU), where she leads the Performance and Energy Aware Computing Laboratory (PeacLab), which works to make computer systems more intelligent and energy-efficient. Coskun is also the Director of the Center for Information and Systems Engineering (CISE). Coskun’s research interests intersect design automation, computer systems, and architecture. Her research has earned several technical awards, including the NSF CAREER Award, the IEEE CEDA Ernest Kuh Early Career Award, and an IBM Faculty Award. Coskun has been an avid collaborator with industry (including IBM TJ Watson, Oracle, AMD, Intel, and others) and received several patents during her time at Sun Microsystems (now Oracle). Her research team has released several impactful open-source software artifacts and tools to the community. Coskun has also regularly participated in outreach programs at BU and founded a new forum called “Advancing Diversity in EDA” (DivEDA). She currently serves as the Deputy Editor-in-Chief of the IEEE Transactions on Computer Aided Design. Coskun received her PhD degree in Computer Engineering from the University of California San Diego.

Start date
Monday, Oct. 28, 2024, 11:15 a.m.
End date
Monday, Oct. 28, 2024, 12:15 p.m.
Location
