Past Events

How Well Can We Generalize Nonlinear Learning Models in High Dimensions?

Inbar Seroussi (Weizmann Institute of Science)

Modern learning algorithms such as deep neural networks operate in regimes that defy traditional statistical learning theory. Neural network architectures often contain more parameters than training samples, yet despite this enormous complexity, the generalization error they achieve on real data is small. In this talk, we study the generalization properties of algorithms in high dimensions. We first show that algorithms in high dimensions require a small bias for good generalization, and that this is indeed the case for deep neural networks in the over-parametrized regime. We then provide lower bounds on the generalization error, valid for any algorithm, in various settings, and calculate these bounds using random matrix theory (RMT). We will review the connection between deep neural networks and RMT, along with existing results. These bounds are particularly useful when analytic evaluation of standard performance bounds is impossible due to the complexity and nonlinearity of the model; they can serve as a benchmark for testing performance and for optimizing the design of actual learning algorithms. Joint work with Ofer Zeitouni.

Inbar Seroussi is a postdoctoral fellow in the mathematics department at the Weizmann Institute of Science, hosted by Prof. Ofer Zeitouni. Previously, she completed her Ph.D. in the applied mathematics department at Tel-Aviv University under the supervision of Prof. Nir Sochen. Her research interests include the modeling of complex and random systems in high dimensions, with applications to modern machine learning, physics, and medical imaging. She develops and uses advanced tools drawn from statistical physics, stochastic calculus, and random matrix theory.

Data Science in Business vs. Academia

Philippe Barbe (Paramount)

This talk discusses similarities and differences between doing data science in academic and business environments. What are the main relevant differences between these environments? Why are the problems of different complexity? What is helpful to know? The talk builds on my years of experience doing both. All questions are welcome.

Philippe Barbe, PhD, is Senior Vice President of Content Data Science at Paramount (formerly ViacomCBS). In this role, Philippe is responsible for data science modeling to inform content exploitation decisions across Paramount businesses. His team builds predictive models that support highly critical, multi-million-dollar content-related decisions in collaboration with many data science and research groups across Paramount.

Philippe received a PhD in mathematics and statistics from the University Pierre et Marie Curie in Paris, France (now Sorbonne University) and a degree in management and government from ENSAE. He worked for over 20 years at the CNRS as a mathematician specializing in data science and related fields. He has authored or co-authored 5 books and numerous scientific papers, and has been an invited professor at many universities worldwide, including Yale and Georgia Tech in the US. He has been working in the media and entertainment industry since 2015.

Method of Moments: From Sample Complexity to Efficient Implicit Computations

Joao Pereira (The University of Texas at Austin)

In this talk, I focus on the multivariate method of moments for parameter estimation. First, from a theoretical standpoint, we show that in problems where the noise is high, the number of observations necessary to estimate parameters is dictated by the moments of the distribution. Second, from a computational standpoint, we address the curse of dimensionality: the d-th moment of an n-dimensional random variable is a tensor with n^d entries. For Gaussian Mixture Models (GMMs), we develop numerical methods for implicit computations with the empirical moment tensors. This reduces the computational and storage costs, and opens the door to making the method of moments competitive with expectation-maximization methods. Time permitting, we connect these results to symmetric CP tensor decomposition and sketch a recent algorithm that is faster than the state of the art and comes with guarantees. Collaborators include Joe Kileel (UT Austin), Tamara Kolda, and Timo Klock (Deeptech).
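To make the curse of dimensionality concrete, here is a minimal NumPy sketch (illustrative sizes of my choosing, not the talk's algorithms): it forms an empirical third moment tensor explicitly, with n^3 entries, and then contracts it with vectors without ever materializing it, the kind of implicit computation with moment tensors that the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 5, 1000                 # dimension and sample size (illustrative)

X = rng.normal(size=(N, n))    # N samples of an n-dimensional variable

# Explicit empirical 3rd moment: average of x (x) x (x) x over the sample.
# The result has n**3 entries, which becomes prohibitive as n grows.
M3 = np.einsum('ia,ib,ic->abc', X, X, X) / N
assert M3.shape == (n, n, n)

# An "implicit" computation never forms M3: contracting with vectors u, v
# works directly on the data, at O(N*n) cost instead of O(n**3) storage.
u, v = rng.normal(size=n), rng.normal(size=n)
implicit = ((X @ u) * (X @ v)) @ X / N     # M3 contracted with u and v
explicit = np.einsum('abc,a,b->c', M3, u, v)
assert np.allclose(implicit, explicit)
```

The same identity underlies moment-matching at scale: everything needed from the tensor can be obtained through contractions against the data matrix.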

João is a postdoc at the Oden Institute at UT Austin, working with Joe Kileel and Rachel Ward. Previously, he was a postdoc at Duke University, working with Vahid Tarokh, and he obtained his Ph.D. in Applied Mathematics at Princeton University, advised by Amit Singer and Emmanuel Abbe. This summer, he will join IMPA, in Rio de Janeiro, Brazil, as an assistant professor. He is broadly interested in tensor decompositions, information theory, and applied mathematics.

Creating Value in PE Using Advanced Analytics

Erik Einset (Global Infrastructure Partners)

Value creation in private equity investment portfolios is fundamental to delivering results for PE customers. Our focus is on the energy and transportation sectors; by developing a deep understanding of how these industries work, we explore applications where advanced analytics and better use of data can create more efficient operations and growth, which translates into increased earnings and value. We will discuss how value is created and some specific use cases where we believe there are opportunities to apply advanced analytics.

Erik has over 30 years of experience in various engineering and leadership roles, including 17 years at GE in R&D, product development, process improvement, technical sales, and management.  Since 2008, he has been a member of the Business Improvement team at Global Infrastructure Partners, working in a variety of infrastructure businesses in the energy and transportation sectors.  Erik is the author of 6 patents and numerous technical publications, and holds Chemical Engineering degrees from Cornell University (BS) and the University of Minnesota (PhD).

Relaxing Gaussian Assumptions in High Dimensional Statistical Procedures

Larry Goldstein (University of Southern California)

The assumption that high-dimensional data is Gaussian is pervasive in many statistical procedures, due not only to its tail decay, but also to the level of analytic tractability this special distribution provides. We explore relaxing the Gaussian assumption in Single Index models and Shrinkage estimation using two tools that originate in Stein's method: Stein kernels and the zero bias transform. This approach leads to measures of discrepancy from the Gaussian that arise naturally from the procedures considered, and it yields performance bounds in contexts not restricted to the Gaussian. The resulting bounds are tight in the sense that they include an additional term reflecting the cost of deviation from the Gaussian, a term that vanishes in the Gaussian case, thus recovering this particular special case.

Joint work with: Xiaohan Wei, Max Fathi, Gesine Reinert, and Adrien Saumard
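Shrinkage estimation under Gaussian noise is classically illustrated by the James-Stein estimator; the Monte Carlo sketch below (dimensions, mean vector, and noise model are my own illustrative choices) shows the risk improvement over the maximum-likelihood estimate that holds in the Gaussian case, the baseline whose relaxation the talk studies.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 50, 500            # dimension (n >= 3) and Monte Carlo trials
theta = np.ones(n)             # unknown mean vector to be estimated

mle_risk = js_risk = 0.0
for _ in range(trials):
    x = theta + rng.normal(size=n)          # observe x ~ N(theta, I)
    js = (1 - (n - 2) / (x @ x)) * x        # James-Stein: shrink toward 0
    mle_risk += np.sum((x - theta) ** 2)    # squared error of the MLE x
    js_risk += np.sum((js - theta) ** 2)    # squared error after shrinkage

mle_risk /= trials   # close to n = 50, the constant risk of the MLE
js_risk /= trials    # strictly smaller for n >= 3 under Gaussian noise
```

For n >= 3 the shrunken estimate dominates the raw observation in mean squared error; quantifying how much of this Gaussian picture survives under other noise distributions is exactly the kind of question the zero bias approach addresses.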

Larry Goldstein received his PhD in Mathematics from the University of California, San Diego in 1984, and is currently Professor in the Department of Mathematics at the University of Southern California in Los Angeles. His main area of study is the use of Stein's method for distributional approximation and its applications in statistics; he also has interests in concentration inequalities, sequential analysis, and sampling schemes in epidemiology.

Multi-Agent Autonomy and Beyond: A Mathematician’s Life at GDMS

Ben Strasser (General Dynamics Mission Systems)

Multi-agent autonomy is a broad field touching a wide variety of topics, including control theory, hybrid system verification, game theory, reinforcement learning, information theory, and network optimization. Agents must carefully use limited computational resources to perform complex, collaborative tasks while contending with both in-team information imbalances and non-collaborating agents. This talk provides a high-level overview of the multi-agent autonomy problem space and identifies several practical and theoretical challenges we face. I discuss recent work in multi-agent autonomy and my experience as a mathematician at GDMS. I recommend this talk for any mathematics student considering a career in industry, as well as anyone interested in problems related to multi-agent autonomy.

Using Artificial Intelligence to Model and Support the Management of Multimorbid Patients

Martin Michalowski (University of Minnesota, Twin Cities)

Multimorbidity, the coexistence of two or more health conditions, has become more prevalent as mortality rates in many countries have declined and their populations have aged. Multimorbidity presents significant difficulties for Clinical Decision Support Systems (CDSS), particularly in cases where recommendations from relevant clinical guidelines offer conflicting advice. An active area of research is focused on developing computer-interpretable guideline (CIG) modeling formalisms that integrate recommendations from multiple Clinical Practice Guidelines (CPGs) for knowledge-based multimorbidity decision support. In this talk, I will present our work on the development of a framework for comparing the different approaches to multimorbidity CIG-based clinical decision support (MGCDS) and our work on building an AI planning-based system called MitPlan that addresses the MGCDS problem. I will use clinical scenarios to demonstrate the sets of features key to providing MGCDS and how MitPlan provides real-world clinical decision support.

Dr. Michalowski is an Assistant Professor in the School of Nursing and a member of the Nursing Informatics Faculty at the University of Minnesota. He is a Senior Researcher in the Mobile Emergency Triage (MET) Research Group at the University of Ottawa and serves as a Director of Machine Learning Research. His research portfolio includes novel contributions in the areas of information integration, record linkage, heuristic-based planning, constraint satisfaction problems, and leveraging artificial intelligence (AI) methods in nursing informatics research. His interdisciplinary research brings advanced AI methods and models to clinical decision support at the point of care and to personalized medicine. He strives to improve patient outcomes by engaging nurses as leaders in the development and adoption of AI-based technology in health care.

Dr. Michalowski earned his Ph.D. in Computer Science from the University of Southern California, where his work focused on automated reasoning problems. In 2018 he was elected Senior Member of the Association for the Advancement of Artificial Intelligence (AAAI), and in 2021 he was named to the Fellows of the American Medical Informatics Association (FAMIA). He has authored or co-authored over 75 peer-reviewed articles on a range of AI-related topics and has served on the program committees of various informatics and computer science conferences, including AAAI, AMIA, IJCAI, ACM GIS, ICAPS, and ISWC. Dr. Michalowski is the organizing chair of the International Workshop on Health Intelligence (W3PHIAI), held at the AAAI annual conference. He was co-chair of the 2020 International Conference on Artificial Intelligence in Medicine (AIME 2020) and serves in the same role for AIME 2022. His research has received funding from the NSF, NIH, DARPA, DoD, and various private foundations. His work has resulted in two patents and several startup companies.

Musings from a Computer Vision Career

Evan Ribnick (Reveal Technology)

In this talk, I will share a high-level overview of some of my experiences working in industry as a computer vision engineer, and reflect on some of the important lessons learned from these experiences. In addition, I will offer some advice and insights that may be useful to grad students and others preparing to transition to industry. This includes highlighting some of the differences between academia and industry, and discussing the skills and behaviors that might help navigate this landscape. 

Evan Ribnick is currently a Principal Computer Vision Engineer at Reveal Technology. Prior to joining Reveal, he held positions at CyberOptics Corp. and 3M's Corporate Research Lab, and has consulted for other companies in the area of computer vision. He received a Ph.D. in Electrical and Computer Engineering from the University of Minnesota in 2009. His work has focused mainly on computer vision, 3D reconstruction, computational photography, and image processing, including applications of these in a broad range of industries and settings. He is the author of several peer-reviewed academic papers and patents, and has worked on products that have been deployed and commercialized in various industries.

Auto-differentiable Ensemble Kalman Filters

Daniel Sanz-Alonso (University of Chicago)

Data assimilation is concerned with sequentially estimating a temporally-evolving state. This task, which arises in a wide range of scientific and engineering applications, is particularly challenging when the state is high-dimensional and the state-space dynamics are unknown. In this talk I will introduce a machine learning framework for learning dynamical systems in data assimilation. Our auto-differentiable ensemble Kalman filters (AD-EnKFs) blend ensemble Kalman filters for state recovery with machine learning tools for learning the dynamics. In doing so, AD-EnKFs leverage the ability of ensemble Kalman filters to scale to high-dimensional states and the power of automatic differentiation to train high-dimensional surrogate models for the dynamics. Numerical results using the Lorenz-96 model show that AD-EnKFs outperform existing methods that use expectation-maximization or particle filters to merge data assimilation and machine learning. In addition, AD-EnKFs are easy to implement and require minimal tuning. This is joint work with Yuming Chen and Rebecca Willett.
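One reason ensemble Kalman filters mesh well with automatic differentiation is that a single analysis (update) step is just a few lines of smooth linear algebra. The sketch below shows one stochastic EnKF update with perturbed observations; the dimensions, the linear observation operator H, and the noise covariance R are illustrative choices of mine, not the talk's setup.

```python
import numpy as np

rng = np.random.default_rng(2)
d, m, p = 10, 40, 4            # state dim, ensemble size, observation dim

H = rng.normal(size=(p, d))    # linear observation operator (illustrative)
R = 0.5 * np.eye(p)            # observation noise covariance
ens = rng.normal(size=(m, d))  # forecast ensemble, one member per row
y = rng.normal(size=p)         # the observation to assimilate

# Stochastic EnKF analysis step: estimate the state covariance from the
# ensemble, form the Kalman gain, and update each member against a
# perturbed copy of the observation.
A = ens - ens.mean(axis=0)                       # ensemble anomalies
C = A.T @ A / (m - 1)                            # sample state covariance
K = C @ H.T @ np.linalg.inv(H @ C @ H.T + R)     # Kalman gain
eps = rng.multivariate_normal(np.zeros(p), R, size=m)
analysis = ens + (y + eps - ens @ H.T) @ K.T     # updated ensemble
```

Every operation here is a composition of differentiable array operations, so wrapping the dynamics and this update in an autodiff framework lets gradients flow through the filter to the parameters of a surrogate model.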

Prof. Sanz-Alonso is an Assistant Professor in the Department of Statistics at the University of Chicago and a member of the Committee on Computational and Applied Mathematics. His research addresses theoretical and computational challenges motivated by data-centric applications in graph-based learning, inverse problems, and data assimilation. His work was recognized with the José Luis Rubio de Francia prize, awarded by the Spanish Royal Society of Mathematics to the best Spanish mathematician under 32. Prof. Sanz-Alonso's research is funded by the National Science Foundation, the National Geospatial-Intelligence Agency, the Department of Energy, and the BBVA Foundation.

Before moving to Chicago, Prof. Sanz-Alonso was a postdoctoral research associate and a member of the Data Science Initiative at Brown University. He completed his Ph.D. in Mathematics and Statistics at the University of Warwick, UK.

On Multiclass Adversarial Training, Perimeter Minimization, and Multimarginal Optimal Transport Problems

Nicolas Garcia Trillos (University of Wisconsin, Madison)

Adversarial training is a framework widely used by machine learning practitioners to enforce robustness of learning models. Despite the development of several computational strategies for adversarial training and some theoretical development in the broader distributionally robust optimization literature, there are still several theoretical questions about adversarial training that remain relatively unexplored. One such question is to understand, in more precise mathematical terms, the type of regularization enforced by adversarial training in modern settings like non-parametric classification as well as classification with deep neural networks. In this talk, I will present a series of connections between adversarial training and several problems in the calculus of variations, geometric measure theory, and multimarginal optimal transport. These connections reveal a rich geometric structure of adversarial problems and conceptually all aim at answering the question: what is the regularization effect induced by adversarial training? In concrete terms, I will discuss an equivalence between a family of adversarial training problems for non-parametric classification and a family of regularized risk minimization problems where the regularizer is a nonlocal perimeter functional. I will also present a result with interesting computational implications: to solve certain adversarial training problems for classification, it is enough to solve a suitable multimarginal optimal transport problem where the number of marginals is equal to the number of classes in the original classification problem.

This talk is based on joint works with Ryan Murray, Camilo García Trillos, Leon Bungert, Jakwang Kim, Matt Jacobs, and Meyer Scetbon.
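The regularization effect of adversarial training can already be seen in the simplest possible setting: a linear classifier against a norm-bounded adversary, where the inner maximization has a closed form. The sketch below is my own illustrative example (not the talk's nonparametric or multiclass setting) and shows the adversarial margin reducing to the clean margin minus a norm penalty.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20
w = rng.normal(size=n)                 # linear classifier x -> sign(w @ x)
x = rng.normal(size=n)
y = 1.0                                # true label in {-1, +1}
eps = 0.1                              # adversary's budget in the 2-norm

# Inner maximization of adversarial training for a linear model,
#   min over ||delta|| <= eps of  y * w @ (x + delta),
# is solved in closed form: push x directly against the margin direction.
delta = -eps * y * w / np.linalg.norm(w)
adv_margin = y * (w @ (x + delta))

# The worst-case margin equals the clean margin minus eps * ||w||: the
# adversary acts as a norm regularizer, the simplest instance of the
# regularization effect discussed in the talk.
assert np.isclose(adv_margin, y * (w @ x) - eps * np.linalg.norm(w))
```

In the non-parametric classification problems of the talk, the analogous penalty is no longer a vector norm but a nonlocal perimeter of the decision regions.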

Nicolas Garcia Trillos is currently an Assistant Professor in the Department of Statistics at the University of Wisconsin-Madison. He finished his PhD in mathematics at Carnegie Mellon University in 2015. His academic interests lie at the intersection of applied analysis, applied probability, statistics, and machine learning.