CS&E Colloquium: Minjia Zhang
The computer science colloquium takes place on Mondays and Fridays from 11:15 a.m. to 12:15 p.m. More details about the spring 2023 series will be provided at the beginning of the semester. This week's speaker, Minjia Zhang (Microsoft Research), will give a talk titled "Efficient System and Algorithm Design for Large-Scale Deep Learning Training and Inference".
Abstract
Deep learning models have achieved significant breakthroughs in the past few years. However, it is challenging to provide efficient computation and memory capabilities for both DNN training and inference, given that model size and complexity keep increasing rapidly. On the training side, it is too slow to train high-quality models on massive data, and large-scale model training often requires complex refactoring of models and access to prohibitively expensive GPU clusters, which are out of reach for many practitioners. On the serving side, many DL models suffer from long inference latency and high costs, preventing them from meeting latency and cost goals. In this talk, I will introduce my work on tackling efficiency problems in DNN/ML through system, algorithm, and modeling optimizations.
Biography
Minjia Zhang is a Principal Researcher at Microsoft. His primary research focus is efficient AI/ML, with a special emphasis on the intersection of large-scale deep learning training and inference system optimization and novel machine learning algorithms. His work has been published at major computer science conferences, including top-tier systems conferences (ASPLOS, NSDI, USENIX ATC) and top-tier machine learning conferences (ICML, NeurIPS, ICLR). Several of his research results have been transferred to industry systems and products, such as Microsoft Bing, Ads, Azure SQL, and Windows, leading to significant latency and capacity improvements.