Machine Learning Seminar Series

Physics-informed machine learning for molecular simulations in nanoporous materials discovery

by

Yangzesheng Sun
Department of Chemistry and
Department of Computer Science & Engineering
University of Minnesota

Wednesday, October 14, 2020
3:30–4:30 pm

Nanoporous materials are promising candidates for clean-energy chemical storage and separation processes, and combining high-throughput molecular simulations with machine learning methods has dramatically accelerated the design of chemical systems involving nanoporous materials. Although machine learning has shown great success in various domains, physical inductive biases play a crucial role in the generalization and transferability of machine learning models for materials discovery. Because molecular simulations are statistical thermodynamic in nature, the loss function of a machine learning model that predicts molecular simulation results can be structured as minimizing the KL divergence between the statistical thermodynamic distribution and the approximating distribution parametrized by the model. For the case where multiple types of guest molecules are separated by a nanoporous material, a strongly physics-informed and interpretable machine learning model based on the Transformer was developed by drawing an analogy between molecules in a chemical system and words in natural language. With almost trivial modifications to the original Transformer, the model dramatically outperformed a regular neural network on generalization in the state space for the separation of an 8-component benzene derivative mixture. Besides integrating physical principles into the model architecture, meta-learning was also an effective approach to learning physical inductive biases directly from data. A meta-learning model was developed to jointly learn the hydrogen storage properties of multiple materials at multiple thermodynamic states using many small simulation datasets, improving extrapolation and few-shot performance. While machine learning has mainly been employed to build surrogate models for simulations, data-driven or differentiable simulations have also emerged as a new research direction.
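As a rough illustration of the KL-divergence objective mentioned in the abstract, the Python sketch below compares a Boltzmann distribution over a few discrete states with a softmax distribution produced by a stand-in "model". It is not the speaker's implementation: the three-state energies, the value of kT, and all function names are hypothetical and chosen only to make the idea concrete.

```python
import math

def boltzmann(energies, kT):
    """Normalized Boltzmann probabilities, p_i proportional to exp(-E_i / kT)."""
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)       # partition function of the shifted system
    return [w / z for w in weights]

def softmax(logits):
    """Model distribution q obtained from unnormalized log-probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [x / s for x in exps]

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# Hypothetical toy system: three states with energies in units of kT.
energies = [0.0, 1.0, 2.5]
kT = 1.0
p = boltzmann(energies, kT)

# A model whose logits equal -E/kT reproduces the target distribution,
# while a model with constant logits (uniform q) does not.
perfect = kl_divergence(p, softmax([-e / kT for e in energies]))
uniform = kl_divergence(p, softmax([0.0, 0.0, 0.0]))
print(perfect, uniform)
```

In this toy setting the divergence vanishes when the model logits match -E/kT and is positive otherwise, which is the property that makes KL divergence usable as a training loss.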

Yangzesheng (Andrew) Sun is a PhD student in the Department of Chemistry at the University of Minnesota, advised by Prof. J. Ilja Siepmann. His research interests include physics-informed machine learning, Monte Carlo simulations for molecular systems, and high-performance computing. He received a BS in Chemistry with Honors from Wuhan University in 2017.