Colloquium: Seeking Efficiency and Interpretability in Deep Learning

The computer science colloquium takes place on Mondays and Fridays from 11:15 a.m. to 12:15 p.m.

This week's speaker, Hao Li (Amazon Web Services AI, Seattle), will be giving a talk titled "Seeking Efficiency and Interpretability in Deep Learning".

Abstract

The empirical success of deep learning in many fields, especially Convolutional Neural Networks (CNNs) for computer vision tasks, is often accompanied by significant computational cost for training, inference, and hyperparameter optimization (HPO). Meanwhile, the mystery of why deep neural networks can be trained effectively and generalize well remains only partially understood. With the pervasive use of deep neural networks in critical applications on both cluster and edge devices, there is a surge in demand for efficient, automatic, and interpretable model inference, optimization, and adaptation.


In this talk, I will present techniques we developed for reducing neural networks' inference and training cost, along with a better understanding of their training dynamics and generalization ability. I will begin by introducing the filter pruning approach for accelerating the inference of CNNs and exploring the possibility of training quantized networks on devices with hardware constraints. Then, I will present how the loss surface of neural networks can be properly visualized with filter-normalized directions, enabling meaningful side-by-side comparisons of the generalization ability of neural nets trained with different architectures or hyperparameters. Finally, I will revisit common practices of HPO for transfer learning tasks. By identifying correlations among hyperparameters and the connection between task similarity and optimal hyperparameters, the black-box hyperparameter search process can be whitened and expedited.
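The filter pruning approach mentioned above ranks a convolutional layer's filters by importance and removes the least important ones, shrinking the layer's output channels and thus its compute cost. As a rough, hedged illustration only (the talk itself may use a different criterion), a minimal sketch using the L1 norm of each filter's weights as the importance score might look like this; the function name and shapes are illustrative assumptions, not part of the talk:

```python
import numpy as np

def rank_filters_by_l1(conv_weights):
    """Rank conv filters by L1 norm of their weights (an illustrative criterion).

    conv_weights: array of shape (out_channels, in_channels, kH, kW).
    Returns filter indices sorted from smallest to largest L1 norm;
    the smallest-norm filters are the natural pruning candidates.
    """
    # Sum of absolute weights per output filter
    norms = np.abs(conv_weights).reshape(conv_weights.shape[0], -1).sum(axis=1)
    return np.argsort(norms)

# Toy example: 4 filters, one of which has near-zero weights
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3, 3, 3))
w[2] *= 1e-3  # filter 2 contributes almost nothing
order = rank_filters_by_l1(w)
print(order[0])  # filter 2 surfaces as the first pruning candidate
```

In practice, the selected filters (and the corresponding input channels of the next layer) are removed and the network is fine-tuned to recover accuracy.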

Biography

Hao Li is an Applied Scientist at Amazon Web Services AI, Seattle, where he researches and develops efficient and automatic machine learning for the cloud-based image and video analysis service Rekognition. He has contributed to the launch of new vertical services including Custom Labels, Content Moderation, and Lookout. Before joining AWS, he received his PhD in Computer Science from the University of Maryland, College Park, advised by Prof. Hanan Samet and Prof. Tom Goldstein. His research lies at the intersection of machine learning, computer vision, and distributed computing, with a focus on efficient, interpretable, and automatic machine learning on platforms ranging from high-performance clusters to edge devices. His notable research contributions include the first filter pruning method for accelerating CNNs/ResNets and the loss surface visualization for understanding the generalization of neural nets.

His work on the trainability of quantized networks received Best Student Paper Award at ICML’17 Workshop on Principled Approaches to Deep Learning.

Start date
Monday, March 22, 2021, 11:15 a.m.
End date
Monday, March 22, 2021, 12:15 p.m.
Location

Online - Zoom link