Past events

Colloquium: Incorporating Medical Insight into Machine Learning Algorithms for Learning, Inference, and Model Explanation

The computer science colloquium takes place on Mondays from 11:15 a.m. - 12:15 p.m.

This week's speaker, Kayhan Batmanghelich (University of Pittsburgh), will be giving a talk titled "Incorporating Medical Insight into Machine Learning Algorithms for Learning, Inference, and Model Explanation".

Abstract

The healthcare industry is arriving at a new era where the medical communities increasingly employ computational medicine and machine learning. Despite significant progress in the modern machine learning literature, adopting the new approaches has been slow in the biomedical and clinical research communities due to the lack of explainability and limited data. Such challenges present new opportunities to develop novel methods that address AI's unique challenges in medicine.  
In this talk, we show examples of incorporating medical insight to improve the statistical power of association between various data modalities, design a novel self-supervised learning algorithm, and develop a context-specific model explainer. This general strategy can be employed to integrate other biomedical data, an exciting future research direction discussed briefly.

Biography

Kayhan Batmanghelich is an Assistant Professor in the Department of Biomedical Informatics and Intelligent Systems Program, with secondary appointments in the Computer Science Department at the University of Pittsburgh, and an adjunct faculty member in the Machine Learning Department at Carnegie Mellon University. He received his Ph.D. from the University of Pennsylvania (UPenn) under the supervision of Prof. Ben Taskar and Prof. Christos Davatzikos. He spent three years as a postdoc in the Computer Science and Artificial Intelligence Lab (CSAIL) at MIT, working with Prof. Polina Golland. His research is at the intersection of medical vision, machine learning, and bioinformatics. His group develops machine learning methods that address the interesting challenges of AI in medicine, such as explainability, learning with limited and weak data, and integrating medical image data with other biomedical data modalities. His research is supported by awards from NIH and NSF, as well as industry-sponsored projects.

Colloquium: Analyze and rebuild: Redesigning distributed computing systems for the next killer app

The computer science colloquium takes place on Mondays and Fridays from 11:15 a.m. - 12:15 p.m.

This week's speaker, Ali Anwar (IBM Research - Almaden), will be giving a talk titled "Analyze and rebuild: Redesigning distributed computing systems for the next killer app".

Abstract

Modern data applications such as distributed machine learning are revolutionizing all aspects of computing-based scientific discovery. As new applications, algorithms, and techniques are invented, the underlying distributed system platforms supporting these uses face fundamentally new challenges. One such challenge is workload dynamicity, which renders static, design-time system decisions impractical for supporting ever-changing application needs. Studying the workload characteristics of these applications and making informed design decisions can significantly improve the efficiency of the underlying distributed system or platform that enables such applications. Similarly, resource and data heterogeneity play an important role in defining the overall performance of these applications.

This talk covers two of my projects where performing workload and resource usage analysis enabled us to design better systems. First, I will show how studying the workload characteristics of Docker, the de facto standard for data center container management, at enterprise scale using IBM production systems enabled us to better deal with workload dynamicity and create a number of optimizations to improve application performance. Second, I will present how we enhanced the powerful Federated Learning approach in distributed machine learning by making it aware of the underlying platform characteristics, such as resource and data heterogeneity, and show how the heterogeneity can affect the robustness of trained models under adversarial attacks. I will conclude with a discussion of plans for my future research.

Biography

Ali Anwar is a Research Staff Member at IBM Research Almaden Center. He holds a Ph.D. degree in Computer Science from Virginia Tech. In his earlier years, he worked as an open-source tools developer (GNU GDB) at Mentor Graphics. His research interest lies at the intersection of systems and machine learning. The overarching goal of his research is to enable efficient and flexible systems for the growing data demands of modern high-end applications running on existing as well as emerging computing platforms. His current work focuses on distributed machine/federated learning systems and platforms, serverless and microservice-based systems, and efficient storage for Docker containers.

His research has appeared in a number of premier conferences and workshops in computer systems, AI/ML, and high-performance computing, including USENIX FAST, ATC, HotStorage, ACM/IEEE SC, ACM HPDC, SoCC, AISec [Best Paper Award], and AAAI. He regularly performs professional community service and has served as a program committee member for conferences such as SC, HPDC, ICDCS, and CCGrid, and as a reviewer for journals such as ToS, TPDS, TKDE, TCC, and JPDC. He is also an associate editor for Neural Processing Letters. At IBM, he has been recognized as a 2019 Outstanding Research Accomplishment winner for Advancing Adversarial Robustness in AI Models. In 2020, he received two Research Accomplishment awards for his research on Enterprise-Strength Federated Learning for Hybrid Cloud and Edge, and Container Storage. He is also a recipient of the Pratt Fellowship awarded by the Dept. of Computer Science at Virginia Tech.

Colloquium: Extracting structures from data: The black-box, the manual and the discovered

The computer science colloquium takes place on Mondays and Fridays from 11:15 a.m. - 12:15 p.m.

This week's speaker, Raymond Yeh (University of Illinois at Urbana-Champaign), will be giving a talk titled "Extracting structures from data: The black-box, the manual and the discovered".

Abstract

Representing structure in data is at the heart of computer vision and machine learning, i.e., the act of converting raw data into a useful mathematical form. In this talk, I will discuss solutions that are broadly characterized into three themes: the black-box, the manual, and the discovered. First, I will discuss how to use deep generative models to learn structures for face images and its application to image inpainting. Going beyond black-box models, I will explain how to manually impose structures in deep-nets for human pose-regression. Specifically, I will introduce chirality nets, a family of deep-nets that respects left/right symmetry of human poses. Lastly, I will illustrate how to discover pairwise word-to-object structures in the context of textual-grounding and discuss current efforts towards discovering general structures.

Biography

Raymond A. Yeh is a PhD candidate at the University of Illinois at Urbana-Champaign (UIUC) advised by Alexander Schwing, Minh Do, and Mark Hasegawa-Johnson. Previously, he has spent time interning at Google AI and Johns Hopkins University. He is a recipient of the Google PhD Fellowship, the Mavis Future Faculty Fellowship and the Henry Ford II Scholarship. His research interests lie at the intersection of machine learning and computer vision.

Colloquium: Overcoming the User-Provider Divide in Cloud Computing

The computer science colloquium takes place on Mondays and Fridays from 11:15 a.m. - 12:15 p.m.

This week's speaker, Neeraja Yadwadkar (Stanford University), will be giving a talk titled "Overcoming the User-Provider Divide in Cloud Computing".

Abstract

Today, even after more than a decade of the cloud computing revolution, users still do not have predictable performance for their applications, and the providers continue to suffer loss of revenue due to poorly utilized resources. Moreover, the environmental implications of these inefficiencies are dire: Cloud-hosted data centers consume as much power as a city of a million people and emit roughly as much CO2 as the airline industry. Fighting these implications, especially in the post Moore's law era, is crucial.

My work points out that the root of these inefficiencies is the gap between the users and the providers. To overcome this divide, my research brings out two key insights for building systems that render the cloud smart: a cloud that is easy to use, adaptive, and efficient. First, we must design interfaces to these systems that are intuitive and expressive for users. Such interfaces should open a dialog between users and providers, allowing users to specify high-level application goals, and transfer the responsibility of making low-level resource management decisions to the providers. This opens an opportunity for providers to optimize the use of their resources while still best aligning with user goals. Second, to make the resource management decisions in an adaptive manner in increasingly complex cloud systems, we must leverage Data-Driven or Machine Learning (ML) models. In doing so, my work uses and develops ML algorithms and studies the challenges that such data-driven models raise in the context of systems: modeling uncertainty, cost of training, and generalizability. In this talk, I will present two systems, INFaaS and PARIS, designed to demonstrate the efficacy of these two key insights. These systems represent key steps towards building a smart cloud: they significantly simplify the use of the cloud and improve resource efficiency while meeting user goals.

Biography

Neeraja Yadwadkar is a post-doctoral research fellow in the Computer Science Department at Stanford University, working with Christos Kozyrakis. She is a Cloud Computing Systems researcher with a strong background in Machine Learning (ML). Neeraja's research focuses on using and developing ML techniques for systems, and building systems for ML. Neeraja graduated with a PhD in Computer Science from the RISE Lab at the University of California, Berkeley, where she was advised by Randy Katz and Joseph Gonzalez. Before starting her PhD, she received her master's in Computer Science from the Indian Institute of Science, Bangalore, India, and her bachelor's from the Government College of Engineering, Pune.

GroupLens Seminar: How Developers Talk About Personal Data and What It Means for User Privacy: A Case Study of a Developer Forum on Reddit

For this spring 2021 seminar series, GroupLens has invited the author of a recent human-computer interaction paper to come chat about their work.

 

MSSE Online Information Session

Have all your questions about the Master of Science in Software Engineering (MSSE) program answered by attending this online information session.

RSVP now to reserve your spot.

Attendees will be sent a link prior to the event.

Colloquium: Machine learning for large- and small-data biomedical discovery

The computer science colloquium takes place on Mondays and Fridays from 11:15 a.m. - 12:15 p.m.

This week's speaker, Yunan Luo (University of Illinois at Urbana-Champaign), will be giving a talk titled "Machine learning for large- and small-data biomedical discovery".

Abstract

In modern biomedicine, the role of computation is becoming more crucial in light of the ever-increasing growth of biological data, which requires effective computational methods to integrate it in a meaningful way and unveil previously undiscovered biological insights. In this talk, I will discuss my research on machine learning for large- and small-data biomedical discovery. First, I will describe a representation learning algorithm for the integration of large-scale heterogeneous data that disentangles non-redundant information from noise and represents it in a way amenable to comprehensive analyses; this algorithm has enabled several successful applications in drug repurposing. Next, I will present a deep learning model that utilizes evolutionary data and unlabeled data to guide protein engineering in a small-data scenario; the model has been integrated into lab workflows and enabled the engineering of new protein variants with enhanced properties. I will conclude my talk with future directions of using data science methods to assist biological design and to support decision making in biomedicine.

Biography

Yunan Luo is a Ph.D. student advised by Prof. Jian Peng in the Department of Computer Science, University of Illinois at Urbana-Champaign. Previously, he received his Bachelor’s degree in Computer Science from Tsinghua University in 2016. His research interests are in computational biology and machine learning. His research has been recognized by a Baidu Ph.D. Fellowship and a CompGen Ph.D. Fellowship.

Computer Science major applications open

On March 1, applications open for the computer science and data science majors. The application deadline is May 25.

Students typically apply to a major while enrolled in fall semester courses during their sophomore year (third semester).

Submit your application at the appropriate link below:

All applicants will be notified of their admission decision via email within three weeks of the application deadline.

Colloquium: Toward human-centric language generation systems

The computer science colloquium takes place on Mondays and Fridays from 11:15 a.m. - 12:15 p.m.

This week's speaker, Dongyeop Kang (University of California, Berkeley), will be giving a talk titled "Toward human-centric language generation systems".

Abstract

Natural language generation (NLG) is a key component of many language technology applications such as dialogue systems, question-answering systems, automatic email replies, and story generation. Despite the recent advances of massive language models like GPT-3, texts predicted by such systems are far from any human-like language. In fact, they most often produce either nonfactual text, incoherent text, or pragmatically inappropriate text. Also, the lack of interaction with real users makes the system less controllable and impractical. My research is focused on developing linguistically informed computational models in a wide range of generation tasks and building real-world NLG systems which can interact with humans. In this talk, I propose three steps to develop human-centric language generation systems: (i) Studying linguistic theories, (ii) Developing theory-informed models, and (iii) Building human-machine cooperative systems. My research lies at the intersection of three fields: computational linguistics as a theoretical basis, modern machine learning as a powerful technical tool, and human-computer interaction as a robust, reliable interactive testbed.

Biography

Dongyeop Kang is a postdoctoral scholar at the University of California, Berkeley. He obtained his Ph.D. in the Language Technologies Institute of the School of Computer Science at Carnegie Mellon University. His Ph.D. study was supported by an Allen Institute for AI (AI2) fellowship, a CMU presidential fellowship, and an ILJU graduate fellowship. During his studies, he interned at Facebook AI Research, AI2, and Microsoft Research.

Code Freeze 2021

There’s no doubt we live in an age of large-scale, software-intensive systems that involve complex interactions between humans, machines and the environment. Sociotechnical thinking that considers the relationship between these systems is on the rise and, more than ever, there is demand in industry for holistic engineering approaches.

Join the University of Minnesota Software Engineering Center for the 16th annual Code Freeze event, as we delve into emerging humane engineering practices by and for people, exploring a range of subjects from human-centered design to collaborative software development. This year’s event will feature an interesting lineup of industry leaders and practitioners offering thought-provoking talks and interactive workshops on topics such as learning models, burnout, fairness in AI, human-centered design, internet of body (IoB) and more. Our productive and informative day of workshops and talks will be capped off by a live mob-programming session.

The registration fee for Code Freeze 2021 is $59 for the general public and $49 for alumni of the University of Minnesota MSSE and Computer Science programs. Check back on the UMSEC website as we post more information and details about this event. We look forward to your attendance!