CS&E Colloquium: Co-Designing Algorithms and Hardware for Efficient Machine Learning (ML): Advancing the Democratization of ML
The computer science colloquium takes place on Mondays and Fridays from 11:15 a.m. to 12:15 p.m. This week's speaker, Caiwen Ding (University of Connecticut), will give a talk titled "Co-Designing Algorithms and Hardware for Efficient Machine Learning (ML): Advancing the Democratization of ML".
Abstract
The rapid deployment of ML has exposed challenges such as prolonged computation and a high memory footprint on systems. In this talk, we will present several ML acceleration frameworks built through algorithm-hardware co-design on various computing platforms. The first part presents a fine-grained crossbar-based ML accelerator. Instead of attempting to map the trained positive/negative weights afterwards, our key principle is to proactively ensure that all weights in the same column of a crossbar have the same sign, to reduce area. We divide the crossbar into sub-arrays, providing a unique opportunity for input zero-bit skipping. Next, we focus on co-designing the Transformer architecture, and introduce on-the-fly attention and attention-aware pruning to significantly reduce runtime latency. Then, we will focus on co-designing graph neural network training. To explore training sparsity and assist explainable ML, we propose a hardware-friendly MaxK nonlinearity and tailor a GPU kernel. Our methods outperform the state of the art on different tasks. Finally, we will discuss today's challenges related to secure edge AI and large language model (LLM)-aided agile hardware design, and outline our research plans aimed at addressing these issues.
Biography
Caiwen Ding is an assistant professor in the School of Computing at the University of Connecticut (UConn). He received his Ph.D. degree from Northeastern University, Boston, in 2019, supervised by Prof. Yanzhi Wang. His research interests mainly include efficient embedded and high-performance systems for machine learning, machine learning for hardware design, and efficient privacy-preserving machine learning. His work has been published in high-impact venues (e.g., DAC, ICCAD, ASPLOS, ISCA, MICRO, HPCA, SC, FPGA, Oakland, NeurIPS, ICCV, IJCAI, AAAI, ACL, EMNLP). He is a recipient of the 2024 NSF CAREER Award, Amazon Research Award, and CISCO Research Award. He received best paper nominations at 2018 DATE and 2021 DATE, the best paper award at the DL-Hardware Co-Design for AI Acceleration (DCAA) workshop at 2023 AAAI, the outstanding student paper award at 2023 HPEC, a publicity paper at 2022 DAC, and the 2021 Excellence in Teaching Award from the UConn Provost. His team won first place in accuracy and fourth place overall at the 2022 TinyML Design Contest at ICCAD. He was ranked among Stanford’s World’s Top 2% Scientists in 2023. His research has been mainly funded by NSF, DOE, DOT, USDA, SRC, and multiple industrial sponsors.