Professor Marc Riedel named Fellow of Oracle’s Research Fellows Program
As an Oracle Fellow, Riedel will lead a collaboration between the University of Minnesota Twin Cities and the Mayo Clinic to computationally predict immune response to pathogens
Professor Marc Riedel has been named a Fellow in the inaugural cohort of Oracle’s Research Fellows Program. Riedel will lead the collaboration between the University of Minnesota Twin Cities and the Mayo Clinic on computationally predicting the strength of the bond between viral peptides and cell-surface molecules, which determines individual immune response. The current project will focus on developing algorithms that are specifically targeted at the SARS-CoV-2 virus. Titled “UMN-Mayo Computational Human Immuno-Peptidome (CHIP) Project,” the work will also have far-reaching implications for treatments and vaccine development for diseases caused by other pathogens.
Established under the aegis of Oracle for Research, the Research Fellows Program focuses on supporting research using computational methods, across disciplines. The program identifies transformative research proposals in a competitive process, and supports winning proposals with cloud credits, technical advice and collaboration, and financial support.
Tackling a grand challenge
The human body’s immune response to a virus depends on whether viral protein fragments bind to the grooves on the cell surface. How strongly a peptide molecule (a fragment of a protein from a pathogen) binds to the major histocompatibility complex class I (MHC-I) molecule, which resides on the surface of most of our cells, determines the strength of the body’s immune response. A peptide binds only if it fits perfectly into the cleft on the surface of the MHC-I molecule, like a key into a lock. It is this binding that allows T cells, white blood cells that play a key role in our defense mechanism, to identify and kill off infected cells. The strength of the bond depends on the biochemical affinity between the peptide molecule and the MHC-I molecule, which in turn depends on the molecular shape. There are many variants of the MHC-I molecule across the human population, and the set of peptides that can bind perfectly to an individual’s MHC-I molecules is called their immunopeptidome.
Riedel and his team of scientists aim to characterize the binding strength between the peptides of the SARS-CoV-2 virus and the MHC-I molecules, a task that is seemingly simple. However, when one considers the enormous scale of the problem, the challenging nature of the project quickly becomes apparent. The SARS-CoV-2 virus has 29 distinct viral proteins, which translate to approximately 38,000 peptides. Every individual has up to six variants of the MHC-I molecule, and there are at least 21,000 variants in the human population. At the very least, we are looking at pairing each of the 38,000 peptides of SARS-CoV-2 with each of the 21,000 variants of the MHC-I molecule, which translates to roughly 800 million distinct pairings. With existing simulation software, each pairing requires hours to days of computing time, so covering all the combinations of peptides and MHC-I molecules would take billions of hours or even billions of days. In such a scenario, characterizing binding strength with current tools is an untenable assignment. The scale of the difficulty and the potential impact of a solution make the problem a grand challenge.
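The scale of the problem can be checked with back-of-the-envelope arithmetic. The sketch below uses the peptide and MHC-I variant counts quoted in the article. The per-pairing costs of exactly one hour and exactly one day are illustrative assumptions standing in for "hours to days"; the article does not give precise figures, and the variable names are hypothetical.

```python
# Figures quoted in the article.
NUM_PEPTIDES = 38_000
NUM_MHC_VARIANTS = 21_000

# Assumed bounds for "hours to days" of computing per pairing.
# These two values are illustrative, not taken from the article.
HOURS_PER_PAIRING_LOW = 1    # one hour per pairing
HOURS_PER_PAIRING_HIGH = 24  # one day per pairing

# Every peptide is paired with every MHC-I variant.
pairings = NUM_PEPTIDES * NUM_MHC_VARIANTS

# Total single-processor time at each bound.
total_hours_low = pairings * HOURS_PER_PAIRING_LOW
total_days_high = pairings * HOURS_PER_PAIRING_HIGH // 24

print(f"distinct pairings:         {pairings:,}")         # 798,000,000
print(f"at 1 hour per pairing:     {total_hours_low:,} hours")
print(f"at 1 day per pairing:      {total_days_high:,} days")
```

Even at these minimal bounds the totals approach a billion hours or a billion days, and any pairing that takes several hours or several days pushes them into the billions.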
Commenting on the scale of the problem and the sweeping impact of a solution, Professor Marc Riedel says, “We are tackling a problem that computer science currently judges to be very difficult. It is a foundational problem in computational immunology which, if solved, could inform predictions of disease severity, enable treatments, and guide vaccine development. With this grant we will develop new, highly targeted algorithms to make the computation tractable. We will turn billions of days of computing into millions of minutes. We are very excited.”
Novel approach that rises up to the challenge
The team plans to develop new, highly targeted algorithms to overcome the challenge and make the computation manageable. Beyond the writing of specialized algorithms, two key factors make the project even more ambitious. Firstly, not all the biochemical aspects of binding are clearly understood from a computational perspective. Secondly, the lack of structural models for novel peptides and, more significantly, for most variants of the MHC-I molecule complicates the team’s task. Their goal of developing targeted algorithms is therefore a complex one. They will be drawing on knowledge from structural and molecular biology for their computational models, which will help them successfully predict the binding strength between the molecules. They will also be relying on their multidisciplinary expertise to construct new models of the MHC-I molecules (rather than wait for them to be characterized experimentally). A significant and unique aspect of the project lies in how the characterization activity will happen alongside the application of such characterization.
What Riedel and the team of scientists are seeking to accomplish is to simulate large-scale peptide binding at the level of physical chemistry; i.e., pairing tens of thousands of peptides with tens of thousands of MHC-I molecules. Tools that are currently in use are based on neural networks that are trained on textual data: text labels for MHC-I molecules are paired with amino acid letter sequences of peptides, and scored based on their binding strength. The network does predict how strongly a peptide will bind with an MHC-I molecule, but it does so according to the similarity of the amino acid sequence alone. It does not take into consideration molecular shape or binding chemistry. This, and the fact that the predictions offered by neural networks are statistical inferences, often based on peptides that are dissimilar to those in novel pathogens, means that their results run the risk of not being granular enough and of delivering false positives.
To overcome these challenges, Riedel’s team will incorporate molecular shape and the biochemistry of binding for more exact results. In a novel approach, they will use a three-dimensional molecular model of the peptide and the MHC-I molecule, and use existing software to simulate protein folding. The team will also actively draw on domain-specific knowledge to place peptides in the binding pockets as accurately as possible. The team’s goal is to reduce the one billion days of simulation time to merely a million minutes.
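To put the stated goal in perspective, the short sketch below converts both figures to the same unit and computes the implied speedup. It takes "one billion days" and "a million minutes" from the article at face value as totals of single-processor time; the variable names are hypothetical and the calculation is only unit conversion, not a description of the team's method.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440

# Figures from the article, taken at face value.
current_days = 1_000_000_000   # "one billion days" of simulation
target_minutes = 1_000_000     # "a million minutes"

# Express both in minutes to compare them directly.
current_minutes = current_days * MINUTES_PER_DAY
speedup = current_minutes / target_minutes

# Express the target in days for intuition.
target_days = target_minutes / MINUTES_PER_DAY

print(f"implied speedup: {speedup:,.0f}x")          # 1,440,000x
print(f"target workload: {target_days:,.1f} days")  # 694.4 days
```

Under this reading, the goal corresponds to a speedup of about 1.4 million times, leaving roughly 694 days of total computing that can then be divided across many machines.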
From algorithms to applications
Although the current project focuses on delivering quick outcomes, a factor that is critical to pandemic preparedness, the researchers also aim to develop and strengthen computational immunology to address other diseases. With both near-term and long-term goals in mind, the project will lean on the expertise of a multidisciplinary team. Besides Riedel, the team comprises Dr. George Vasmatzis and Dr. Matthew Block of the Mayo Clinic, Professor Jim Cornette of Iowa State University, and Julia Udell (doctoral student from the Department of Computer Science and Engineering at the University). The Vasmatzis lab at the Mayo Clinic has been working on genomics and computational techniques for cancer immunotherapy, and the Block lab has focused on patient immune response. With the onset of the COVID-19 pandemic, the latter has developed methods to measure immune response to SARS-CoV-2. Udell (who is being jointly advised by Riedel and Vasmatzis) will bring her experience as a biostatistician and her authorship of a neoantigen ranking algorithm to bear on the project. As principal investigator, Riedel will contribute his experience and know-how in circuit design and molecular dynamics, which will provide the critical underpinning needed for the project. The team will be bringing together their prowess in molecular modeling and immunology, access to clinical data, skill in molecular simulation, and experience developing and deploying large-scale computational projects.
The tools developed in the course of the project will help the team identify commonly occurring genomic variants across the United States population that make individuals more or less vulnerable to COVID-19. When the project concludes, the team plans to provide risk assessment of COVID-19 severity on a publicly accessible web-based platform.
For the team, the disease-fighting potential of the project is exciting. The tools and methods used and developed in the course of this study hold significant, far-reaching potential. The key outcomes from the project can be extended to other diseases too, including the ability to predict disease severity in different individuals for new pathogens and for different variants of the same virus, as well as differing vaccine effectiveness against different variants. The work undertaken by the team will be instrumental in tailoring therapies, including vaccines, to individuals depending on their immune response, and in improving preparedness to handle new or as yet unknown pathogens.