Meet the Faculty - Zirui (Ray) Liu
Tell us about your journey to the University of Minnesota.
I didn’t study computer science in my undergraduate or master’s work. My undergraduate and master’s degrees focused on control science, which studies how to control a system like a motor to move a device along a trajectory. In 2015, TensorFlow 1.0 had just come out and I tried to use generative adversarial networks to generate images on my laptop (although I was not that good at it at that time). During my master’s studies, I spent a lot of my spare time studying coding, from computer vision to GPU programming. After that, I decided to go into a computer science PhD program. That’s how I shifted over to computer science.
I actually knew a lot about the Department of Computer Science & Engineering at the University of Minnesota before I even came to interview for this position. My major advisor is in the field of data mining, and there are a lot of big names at the U of M in that field, like Vipin Kumar and John Riedl. In my first year of the PhD program, I worked on recommender systems with the MovieLens dataset, which was developed by the GroupLens Lab. I also worked on training machine learning models on huge graphs, and utilized METIS, which was developed by George Karypis, to partition these graphs. Once I got the interview, I was really excited about the opportunity to work with some of the greatest computer science researchers in the world.
We would love to hear more about your research!
As Wikipedia states: “In computer science, the analysis of algorithms is the process of finding the computational complexity of algorithms—the amount of time, storage, or other resources needed to execute them.” My research builds on this concept, but focuses specifically on scaling machine learning models, both scaling up and scaling down. Scaling down means compressing a large model into a smaller one to reduce the resources needed to run it. Scaling up means combining the computational power of many hardware devices to train a huge model. I design algorithms, and I also study how to implement them using the features of off-the-shelf hardware, like tensor cores in GPUs. I am also very enthusiastic about the recent advances in large language models and the fundamental issues with them, because they show lots of counterintuitive behaviors compared to more traditional machine learning models.
What do you hope to accomplish with this work? What is the real-world impact for the average person?
I think the direct impact of my research is to bring powerful artificial intelligence (AI) models to everyone. Some of my work has already been integrated into popular open-source packages, like llama.cpp, Keras, CogDL, and Hugging Face Transformers. This is my biggest achievement so far, and I love to see that people are using my work to deploy open-source AI models in their own DIY projects. Typically, these models require a lot of expensive GPUs and are not realistically available to most individual researchers. Now, with these techniques, you can deploy models on your cell phone or laptop to build new applications and reduce the costs of running your business.
What courses are you teaching next spring? What can students expect to get out of that class?
I plan to teach a topics course on either system support for large language models or an introduction to large language models. I have not decided yet, but I would like to touch on the inner mechanisms of these AI models and the opportunities they offer. After my course, students should know the critical components of these AI models, as well as how industry builds systems to handle huge numbers of user requests.
What do you do outside of the classroom for fun?
I am a huge fan of tabletop games like Warhammer 40K and Dungeons and Dragons. I also like playing with LEGO and playing video games.
Do you have a favorite spot in the city?
I like the area in St. Paul by the cathedral and Summit Avenue. It is a really nice spot to walk. My favorite restaurant that I have explored in the city is Samarkand, an Uzbek restaurant.
Is there anything else you would like students to know about you or your work?
My research style is coding-heavy. I love to code in my spare time, and many of my papers come from exploring the inner mechanisms and oddities of these machine learning models. Then we propose some strategies to fix these irregularities. My way of understanding ML models is to design controlled experiments to test a hypothesis, much like how Mendel conducted his pea experiments in biology. I hope my students can learn how to rigorously design and implement controlled experiments in a systematic way.