Meet the Faculty - Dongyeop Kang
Tell us about your journey to the University of Minnesota.
I got my Ph.D. from Carnegie Mellon University in 2020 and held a postdoctoral position at the University of California, Berkeley, for one year. I was on the job market and the University of Minnesota was one of the best choices for me. There are really great collaborators at the U of M in human-computer interaction, robotics, data mining and many other fields. I have really enjoyed working with other faculty in the Department of Computer Science and Engineering, as well as with the students here. It is an exciting opportunity to work at the University and the Department is really supportive of my work.
How did you become interested in computer science and your specific field?
My main research area is Natural Language Processing (NLP). The core of the field is understanding how people communicate with each other and the semantics behind the language people use in order to make artificial intelligence (AI) or computer science systems that can better communicate with humans. Recently, this has been an area of interest with the rise of ChatGPT and GPT-4. The original purpose was to help people improve their writing and search-related activities.
When I started my Ph.D., I was interested in machine learning and deep learning. Language processing was the only area where I couldn’t easily “solve” the problem or learn the semantics behind it. At the time, language was a more difficult thing for machines to understand. That is what piqued my interest and it became the focus of my Ph.D. work. My research aimed to produce more coherent, logical and interpersonally aware outputs from AI systems, similar to what ChatGPT does today.
Current AI systems are lacking in interpersonal human interactions - they produce text, but there is no actual communication going on. My post-doc study focused on making AI systems more aware of humans and more collaborative with them. For example, when you are reading difficult papers or you are writing long, scientific texts, the system can help identify definitions in the paper or help flag errors or inconsistencies in your writing. It is more than correcting typos; the aim is to provide actual writing assistance for flow and cohesiveness.
Tell us more about your current research!
My research at the University has expanded from the original work, but the core is still the same. I work on improving collaborative NLP systems. We are currently studying how people revise drafts of papers, and we are teaching the system to revise papers to improve the performance and quality of the text. This iterative process goes back and forth between the human and the machine: the system provides recommendations for revisions and uses the feedback to keep making better suggestions. The goal is to have the process be a collaboration between the person and the machine.
Another project we are working on has to do with personalization. ChatGPT and GPT-4 can reach near-human performance on downstream tasks, but they still fall short in making more individualized predictions. The models are trained to aggregate billions of people’s opinions into one answer. They are not able to adapt their language to the audience. For example, if the system is talking to a child, it would need to use less technical language. The interpersonal context is important to improve these models and hopefully we can help with social reasoning.
Privacy is another important factor when we are dealing with open-source AI, so that is another focus for our work. We are also looking at how these models handle diverse thoughts. The current AI models treat disagreeing thoughts as outliers and filter them out. Our system is trying to take on a distribution of opinions instead of aggregating only the majority voices. We want to make sure that diverse thoughts and voices are represented in some way and that the right signals are sent to our system to make the output more representative and robust.
What do you hope to accomplish with this work? What is the real-world impact for the average person?
Most of my work is related to real-world applications. The reading assistant system that I work on is already plugged into Semantic Scholar, which is similar to Google Scholar. They use our augmented PDF reading interface that can help define terms, citations and other information within a paper. We also built a writing system for scientific writing for researchers and scientists. It helps writers with consistency and coherence, and can even autocomplete some parts of the paper based on feedback.
We actively collaborate with Google and Amazon, and we hope our diversity-aware system can be deployed on top of search engines, AI like ChatGPT and GPT-4, and other established sites to improve their output.
What can students expect to get out of your courses?
I teach an NLP course, CSCI-5541. Students learn about basic to advanced ways to represent language as an input in an NLP system. Once they learn how to decode the semantics of different language processes, they work on utilizing the system for specific tasks. There are a variety of ways this can be applied - dialogue systems, translation systems, detection systems. Students learn how to represent the language input and then come up with practical applications.
My course is very practical and we look at the latest techniques and open-source libraries. All homework is practice-based and my goal is to teach students a hard skill that they can use in their own research or in a future job. I'm also planning to create a new class focusing on Large Language Models (LLMs) in the coming years.
What do you enjoy most about teaching?
At the beginning of the class, 80% of the students will not know anything about NLP. During the second half of the class, students get to the point where they can debate topics and critique applications and projects. That is always a joyful moment for me. I like seeing that understanding develop and watching students help each other and figure out techniques. Everyone comes out with a unique perspective on NLP and that is the most enjoyable part for me.
What do you do outside of the classroom for fun?
I like to check out the different lakes in Minnesota. My favorite lake is Bde Maka Ska. My wife and I go there to walk and jog. I also enjoy playing tennis with my colleagues and friends. I love the trails by the Mississippi River and the Stone Arch Bridge area.
I also like finding good restaurants on campus. We also like traveling locally to Wisconsin and Chicago and finding good Korean food. We plan to explore more winter activities next year.