Advancing Efficient and Trustworthy AI for Science, Engineering, and Medicine
Assistant Professor
Computer Science and Engineering
Behind Current AI Systems: Data Crime and Trust Crisis
AI’s current success and future impact cannot be overstated. AI has already touched many corners of our daily lives, it has been a designated national priority (by the White House) since 2016, and in 2024 it (controversially) earned its community two Nobel Prizes, in physics and chemistry.
ChatGPT, which makes headlines everywhere and raises alarms about the prospect of massive AI-triggered job loss, is a gigantic question-and-answer AI system. It learns from at least tens of billions of web pages, Wikipedia articles, news reports, books, and interactive sessions with human experts: data is the oil for this expensive conversational machine. However, ChatGPT has been criticized for hallucinating: it often fabricates facts, and it reasons in a seemingly logical but flawed manner, producing fallacies and raising concerns.
The data crime and trust crisis around ChatGPT are hallmarks of many modern AI systems: at the heart of these systems is a family of revolutionary computing models called deep neural networks (DNNs), which learn from massive amounts of data but, unfortunately, do not always make reliable predictions. These issues around DNNs are roadblocks to applying AI in scenarios where data are scarce and/or the cost of wrong predictions is high, such as scientific discovery and medical diagnosis.
The overarching goal of my research is to develop theoretical foundations and computing tools to significantly reduce AI’s data dependency and boost AI’s reliability for high-stakes applications in science, engineering, and medicine, through deep interaction and integration with domain scientists.
Saving data: integrating knowledge and data
DNNs learn from massive amounts of data by discovering and extracting informative patterns inside the data, emulating thousands of years of human learning and knowledge accumulation. However, most current DNNs learn from scratch, ignoring the human knowledge that supported the foundations and development of various disciplines before AI’s arrival. So, integrating domain knowledge into the DNN learning process is sensible, especially when data are scarce: prior knowledge fills in the gaps left by the limited available data. Moving from today’s large-data, low-efficiency AI to small-data, high-efficiency AI is a crucial ongoing transition at the frontier of AI research and applications.
Prior knowledge, such as physical laws, often takes the form of mathematical constraints that cannot be violated during DNN learning, which itself typically entails mathematical optimization. So integrating knowledge into a DNN’s learning process often leads to optimization problems with highly nontrivial constraints, a computational barrier that researchers and practitioners in these domains must overcome. To this end, my group has developed the first general-purpose computing framework, NCVX (https://ncvx.org/), to handle such constrained DNN learning reliably. This package has quickly enabled multiple exciting developments with our collaborators, such as flexibly evaluating the robustness of DNNs, automating layout design for advanced manufacturing, and faithfully predicting the properties of novel materials for AI-enabled materials discovery.
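As a generic illustration (a standard textbook formulation, not necessarily the exact problem class that NCVX targets), knowledge-constrained DNN learning can be written as a constrained optimization problem. The symbols are assumptions introduced here: $f_\theta$ is the DNN with weights $\theta$, $\ell$ is a loss over $N$ training pairs $(x_i, y_i)$, and $c_j$ and $h_k$ encode the prior knowledge as inequality and equality constraints.

```latex
\begin{aligned}
\min_{\theta} \quad & \frac{1}{N} \sum_{i=1}^{N} \ell\bigl(f_\theta(x_i),\, y_i\bigr) \\
\text{s.t.} \quad & c_j(\theta) \le 0, \quad j = 1, \dots, J, \\
& h_k(\theta) = 0, \quad k = 1, \dots, K.
\end{aligned}
```

For example, a conservation law that the network output must satisfy could enter as one of the equality constraints $h_k$. Because $f_\theta$ is highly nonlinear in $\theta$, such constraints generally make the problem much harder than unconstrained training.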
Sometimes, data scarcity can be extreme, and one might only get a single data point. This often happens when one tries to infer the structures and properties of extreme-scale objects through scientific observations (e.g., nanoscale materials in materials science and black holes in astrophysics), i.e., scientific inverse problems. Surprisingly, even in such single-instance scenarios, one can still take advantage of DNNs, in the form of so-called single-instance deep generative priors, by coupling them with faithful modeling of the observation processes. However, such single-instance learning is ruined when the observations are corrupted. We have invented the first general method for single-instance learning that stops the degradation caused by various types and levels of corruption, as well as the first method to accelerate the learning process.
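One common way to write the single-instance setup, in the style of deep-image-prior methods, is sketched below. The notation is an assumption for illustration: $y$ is the single observation, $\mathcal{A}$ models the observation process, $G_\theta$ is an untrained DNN with weights $\theta$, $z$ is a fixed input, and $\ell$ measures the data misfit.

```latex
\hat{\theta} \in \arg\min_{\theta} \; \ell\bigl(y,\, \mathcal{A}(G_\theta(z))\bigr),
\qquad
\hat{x} = G_{\hat{\theta}}(z).
```

The unknown object is never fit directly; it is reparameterized as the output of the network, so the architecture itself acts as the prior. When $y$ is corrupted, running this minimization for too long can fit the corruption as well, which is the degradation described above.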
By carefully customizing and optimizing these ideas, we have made breakthroughs on a couple of central inverse problems in scientific imaging: for phase retrieval, our new method has the potential to replace a 40-year-old gold-standard method while needing much less parameter tuning; for blind image deblurring (BID), our novel technique works in an unprecedented regime and often beats state-of-the-art methods that learn from massive data. Building on our new BID method, we are collaborating with our materials science colleagues to sharpen the current imaging techniques for nanomaterial study.
Earning trust: improving AI’s reliability and discretion
Among the numerous directions to make AI trustworthy, my group has focused on the robustness issue: in worst-case scenarios, DNNs can easily make wrong predictions when the input is slightly perturbed, even when the perturbation is imperceptible to human operators, i.e., the adversarial robustness issue. This means that in AI-enabled autonomous driving and medical diagnosis, the underlying AI systems can make surprising mistakes and create liabilities. To ensure robustness, a critical ingredient is the ability to computationally evaluate the level of robustness of any DNN, i.e., robustness evaluation (RE). Building on NCVX and additional speedup techniques, my group has built the first reliable RE tool that can work for virtually any type of perturbation, whereas previous tools work for only three types of perturbation whose practical relevance is still debatable. Moreover, we have also discovered that current efforts to make DNNs less sensitive to input perturbations push them too far: the resulting DNNs are not bold enough to make an alternative prediction when necessary.
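A common way to pose robustness evaluation, sketched here under assumed notation, is as a constrained optimization over the perturbed input $x'$: $f_\theta$ is the trained DNN, $(x, y)$ is a correctly classified input with its label, $\ell$ is a loss, $d$ is a distance that defines the perturbation type, and $\varepsilon$ is the perturbation budget.

```latex
\max_{x'} \; \ell\bigl(f_\theta(x'),\, y\bigr)
\quad \text{s.t.} \quad d(x, x') \le \varepsilon, \quad x' \in [0, 1]^n .
```

Choosing $d$ as an $\ell_p$ distance gives the most commonly studied cases, while a general $d$ (for instance, a perceptual distance) produces the kind of nontrivial constraint that a general-purpose constrained solver is meant to handle.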
On the other hand, before we can obtain robust DNNs, or trustworthy DNNs in general, we can safeguard our DNNs against making severe mistakes: being imperfect does not make DNNs unhelpful or undeployable. A promising approach is to allow them to refrain from making predictions on low-confidence cases and defer these cases to humans, i.e., selective prediction, which trades off the error rate on the cases they do predict (i.e., risk) against the percentage of cases on which they actually make predictions (i.e., coverage). For this, we have derived a lightweight selective prediction method that works effectively even if the deployment environment differs from the DNN’s original learning environment, among the first of its kind. In fact, such environment shifts between learning and deployment are among the primary factors that cause prediction errors and uncertainty.
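The risk-coverage tradeoff can be made concrete with a minimal sketch. It assumes the simplest possible selection rule, thresholding a per-sample confidence score; this is a generic baseline for illustration, not the method described above. The function name `risk_coverage` and the toy data are hypothetical.

```python
import numpy as np


def risk_coverage(confidence, correct, threshold):
    """Return (coverage, risk) when predicting only if confidence >= threshold.

    coverage: fraction of samples on which a prediction is made.
    risk: error rate among the samples that are predicted (not deferred).
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    accepted = confidence >= threshold
    if not accepted.any():
        # Convention assumed here: deferring everything gives zero risk.
        return 0.0, 0.0
    coverage = float(accepted.mean())
    risk = float(1.0 - correct[accepted].mean())
    return coverage, risk


# Hypothetical toy data: confidence scores and whether each prediction is right.
confidence_scores = [0.95, 0.90, 0.60, 0.55, 0.30]
is_correct = [True, True, False, True, False]

for t in (0.0, 0.5, 0.9):
    cov, risk = risk_coverage(confidence_scores, is_correct, t)
    print(f"threshold={t:.1f}  coverage={cov:.2f}  risk={risk:.2f}")
```

Raising the threshold defers more cases to humans, which lowers coverage and, if the confidence score is informative, lowers risk as well. Under a shift between the learning and deployment environments the confidence score may become less informative, which is why the choice of score matters.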
Caring lives: advancing medical AI research
Our passion for data-knowledge integration and for trustworthy and safe AI is constantly fueled by our extensive collaboration with numerous domain experts, especially medical scientists. AI for healthcare and medicine has real potential to overcome resource and workforce shortages in current healthcare systems and raise the standard of care to improve patient lives. There, besides specializing and optimizing our foundational AI results for automated medical diagnosis, prognosis, and drug design (e.g., we developed the first accurate and deployable AI-enabled diagnostic model for COVID in 2020 and deployed it in clinical systems at 12 local hospitals under M Health Fairview), we have to keep rethinking and revamping the foundations of AI, informed and inspired by the unconventional problems often faced in healthcare. For example, we are the first to reexamine the standard ideas for learning from imbalanced biomedical data and find substantial suboptimality in them. We are working on polishing selective prediction and related methods to ensure patient safety in AI-enabled clinical systems.