Past Events

Musings from a Computer Vision Career

Evan Ribnick (Reveal Technology)

In this talk, I will share a high-level overview of some of my experiences working in industry as a computer vision engineer, and reflect on some of the important lessons learned from these experiences. In addition, I will offer some advice and insights that may be useful to grad students and others preparing to transition to industry. This includes highlighting some of the differences between academia and industry, and discussing the skills and behaviors that might help navigate this landscape. 

Evan Ribnick is currently a Principal Computer Vision Engineer at Reveal Technology. Prior to joining Reveal, he held positions at CyberOptics Corp. and 3M’s Corporate Research Lab, and consulted for other companies in the area of computer vision. He received a Ph.D. in Electrical and Computer Engineering from the University of Minnesota in 2009. His work has focused mainly on computer vision, 3D reconstruction, computational photography, and image processing, including applications of these in a broad range of industries and settings. He is the author of several peer-reviewed academic papers and patents, and has worked on products that have been employed and commercialized in various industries.

Auto-differentiable Ensemble Kalman Filters

Daniel Sanz-Alonso (University of Chicago)

Data assimilation is concerned with sequentially estimating a temporally evolving state. This task, which arises in a wide range of scientific and engineering applications, is particularly challenging when the state is high-dimensional and the state-space dynamics are unknown. In this talk I will introduce a machine learning framework for learning dynamical systems in data assimilation. Our auto-differentiable ensemble Kalman filters (AD-EnKFs) blend ensemble Kalman filters for state recovery with machine learning tools for learning the dynamics. In doing so, AD-EnKFs leverage the ability of ensemble Kalman filters to scale to high-dimensional states and the power of automatic differentiation to train high-dimensional surrogate models for the dynamics. Numerical results using the Lorenz-96 model show that AD-EnKFs outperform existing methods that use expectation-maximization or particle filters to merge data assimilation and machine learning. In addition, AD-EnKFs are easy to implement and require minimal tuning. This is joint work with Yuming Chen and Rebecca Willett.
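
For readers unfamiliar with the ensemble Kalman filter component mentioned in the abstract, the following is a minimal sketch of one stochastic (perturbed-observation) analysis step in plain NumPy. It is not the speakers' AD-EnKF implementation: it omits the learned dynamics and the automatic differentiation entirely, and the function name enkf_analysis, the linear observation operator H, and the noise covariance R are assumptions made purely for illustration.

```python
import numpy as np


def enkf_analysis(ensemble, y, H, R, rng):
    """One stochastic (perturbed-observation) EnKF analysis step.

    Illustrative sketch only; names and the linear-observation
    setting are assumptions, not taken from the talk.

    ensemble : (N, d) array, forecast ensemble of N state vectors
    y        : (m,) array, observation
    H        : (m, d) array, linear observation operator (assumed)
    R        : (m, m) array, observation noise covariance (assumed)
    rng      : numpy.random.Generator
    """
    n_members = ensemble.shape[0]

    # Sample covariance of the forecast ensemble.
    anomalies = ensemble - ensemble.mean(axis=0)
    C = anomalies.T @ anomalies / (n_members - 1)

    # Kalman gain K = C H^T (H C H^T + R)^{-1}, computed with a linear
    # solve instead of an explicit inverse (S is symmetric).
    S = H @ C @ H.T + R
    K = np.linalg.solve(S, H @ C).T  # shape (d, m)

    # Each member assimilates its own noisy copy of the observation.
    noise = rng.multivariate_normal(np.zeros(len(y)), R, size=n_members)
    innovations = y + noise - ensemble @ H.T  # shape (N, m)

    return ensemble + innovations @ K.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    forecast = rng.normal(loc=5.0, size=(200, 3))  # ensemble centered near 5
    H = np.eye(3)[:2]                              # observe first two components
    R = 0.01 * np.eye(2)
    analysis = enkf_analysis(forecast, np.zeros(2), H, R, rng)
    print(forecast.mean(axis=0), analysis.mean(axis=0))
```

In this toy setting the observed components of the ensemble mean are pulled toward the observation, while the ensemble spread is what determines how strongly each member is corrected. Only sample statistics of the ensemble are needed, which is the property the abstract credits for scaling to high-dimensional states.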

Prof. Sanz-Alonso is an Assistant Professor in the Department of Statistics at the University of Chicago, and a member of the Committee on Computational and Applied Mathematics. His research addresses theoretical and computational challenges motivated by data-centric applications in graph-based learning, inverse problems, and data assimilation. His work was recognized with the José Luis Rubio de Francia prize for the best Spanish mathematician under 32 by the Spanish Royal Society of Mathematics. Prof. Sanz-Alonso's research is funded by the National Science Foundation, the National Geospatial-Intelligence Agency, the Department of Energy, and the BBVA Foundation.

Before moving to Chicago, Prof. Sanz-Alonso was a postdoctoral research associate and a member of the Data Science Initiative at Brown University. He completed his Ph.D. in Mathematics and Statistics at the University of Warwick, UK.

On Multiclass Adversarial Training, Perimeter Minimization, and Multimarginal Optimal Transport Problems

Nicolas Garcia Trillos (University of Wisconsin, Madison)

Adversarial training is a framework widely used by machine learning practitioners to enforce robustness of learning models. Despite the development of several computational strategies for adversarial training and some theoretical development in the broader distributionally robust optimization literature, there are still several theoretical questions about adversarial training that remain relatively unexplored. One such question is to understand, in more precise mathematical terms, the type of regularization enforced by adversarial training in modern settings like non-parametric classification as well as classification with deep neural networks. In this talk, I will present a series of connections between adversarial training and several problems in the calculus of variations, geometric measure theory, and multimarginal optimal transport. These connections reveal a rich geometric structure of adversarial problems and conceptually all aim at answering the question: what is the regularization effect induced by adversarial training? In concrete terms, I will discuss an equivalence between a family of adversarial training problems for non-parametric classification and a family of regularized risk minimization problems where the regularizer is a nonlocal perimeter functional. I will also present a result with interesting computational implications: to solve certain adversarial training problems for classification, it is enough to solve a suitable multimarginal optimal transport problem where the number of marginals is equal to the number of classes in the original classification problem.

This talk is based on joint works with Ryan Murray, Camilo García Trillos, Leon Bungert, Jakwang Kim, Matt Jacobs, and Meyer Scetbon.

Nicolas Garcia Trillos is currently an Assistant Professor in the Department of Statistics at the University of Wisconsin-Madison. He finished his PhD in mathematics at Carnegie Mellon University in 2015. His academic interests lie at the intersection of applied analysis, applied probability, statistics, and machine learning.  

Data Science @ Meta

Zeinab Takbiri (Facebook)

Abstract has been removed at the request of the speaker.

Integrative Discriminant Analysis Methods for Multi-view Data

Sandra Safo (University of Minnesota, Twin Cities)

Many diseases are complex, heterogeneous conditions that affect multiple organs in the body and depend on the interplay between several factors, including molecular and environmental ones. Understanding this complexity and heterogeneity therefore requires a holistic approach. In this talk, I will present some of our current statistical and machine learning methods for integrating data from multiple sources while simultaneously classifying units or individuals into one of multiple classes or disease groups. The proposed methods are tested using both simulated data and real-world datasets, including RNA sequencing, metabolomics, and proteomics data pertaining to COVID-19 severity. We identified signatures that better discriminated COVID-19 patient groups and that were related to neurological conditions, cancer, and metabolic diseases, corroborating current research findings and heightening the need to study the sequelae of COVID-19 to devise effective treatments and to improve patient care.

Sandra Safo is an Assistant Professor of Biostatistics at the University of Minnesota. She is interested in developing statistical learning, data integration, and feature selection methods for high-dimensional data. Currently, she develops methods for integrative analysis of “omics” (including genomics, transcriptomics, and metabolomics) and clinical data to help elucidate the complex interactions of these multifaceted data types.

 

Towards a Better Evaluation of Football Players

Eric Eager (ProFootballFocus (PFF))

The game of football is undergoing a significant shift towards the quantitative. Much of the progress made in the analytics space can be attributed to play-by-play data and charting data. However, recent years have given rise to tracking data, which has opened the door for innovation that was not possible before. In this talk I will describe how to gain an edge in player evaluation by building on traditional charting data with state-of-the-art player tracking data, and foreshadow how such methods will revolutionize the sport of football in the future.

Eric Eager is the head of research, development and innovation at PFF, a worldwide leader in sports data and analytics. Prior to joining PFF, Eric earned a PhD in mathematical biology from the University of Nebraska, publishing 25 papers in applied mathematics, mathematical biology, ecology and the scholarship of teaching and learning. Eric is a native of Maplewood, MN.

Graph Clustering Dynamics: From Spectral to Mean Shift

Katy Craig (University of California, Santa Barbara)

Clustering algorithms based on mean shift or spectral methods on graphs are ubiquitous in data analysis. However, in practice, these two types of algorithms are treated as conceptually disjoint: mean shift clusters based on the density of a dataset, while spectral methods allow for clustering based on geometry. In joint work with Nicolás García Trillos and Dejan Slepčev, we define a new notion of Fokker-Planck equation on a graph and use this to introduce an algorithm that interpolates between mean shift and spectral approaches, enabling it to cluster based on both the density and geometry of a dataset. We illustrate the benefits of this approach in numerical examples and contrast it with Coifman and Lafon’s well-known method of diffusion maps, which can also be thought of as a Fokker-Planck equation on a graph, though one that degenerates in the zero diffusion limit.

Katy Craig is an assistant professor at the University of California, Santa Barbara, specializing in partial differential equations and optimal transport. She received her PhD from Rutgers University in 2014, after which she spent one year at UCLA as an NSF Mathematical Sciences Postdoctoral Fellow and one year at UCSB as a UC President’s Postdoctoral Fellow.

Best Practices A Data Scientist Should Know

Hande Tuzel (Sabre Corporation)

In this talk, Hande will give an overview of some of the best practices a data scientist should know. These will include topics like virtual environments, utilizing functions, code documentation and other things that you could start incorporating in your data science projects or coding in general. She will also include some quick tips and advice on how to prepare for a Data Scientist job interview. Hopefully, these will help you prepare for a successful career in industry.

Hande received her PhD in Applied Mathematics from the University of Minnesota in 2009, under the supervision of Fadil Santosa. Her dissertation was on improving mask design in integrated circuit printing technologies using level set methods. After a decade of experience in academia training future scientists and engineers, she decided to transition to industry. She is now a self-taught Data Scientist working at Sabre Labs Research. If she is not busy coding or reading a paper, you can find her hiking, crocheting hats, or practicing inversions as a yogi.

Decomposing Low-Rank Symmetric Tensors

Joe Kileel (The University of Texas at Austin)

In this talk, I will discuss low-rank decompositions of symmetric tensors (a.k.a. higher-order symmetric matrices). I will start by sketching how results in algebraic geometry imply uniqueness guarantees for tensor decompositions, and also lead to fast and numerically stable algorithms for calculating the decompositions. Then I will quantify the associated non-convex optimization landscapes. Finally, I will present applications to Gaussian mixture models in data science, and rigid motion segmentation in computer vision. Based on joint works with João M. Pereira, Timo Klock, and Tammy Kolda.

Data-Model Fusion to Predict the Impacts of Climate Change on Mosquito-borne Diseases

Carrie Manore (Los Alamos National Laboratory)

Mosquito-borne diseases are among the many human-natural systems that will be impacted by climate change. All of the life stages and development rates of mosquitoes are impacted by temperature and other environmental factors, and human infrastructure often provides habitat (irrigation, containers, water management, etc.). This poses a very interesting mathematical modeling problem: how do we account for relevant factors, capture the nonlinearities, and understand the uncertainty in our models and in the data used to calibrate and validate the models? I will present several models, ranging from continental to fine scale and from statistical and machine learning to mechanistic, that we are using to predict mosquito-borne diseases and how they will be impacted by climate change. Over 30 people have worked together on this project, including students, postdocs, and staff. Our team is interdisciplinary and tasked with addressing critical national security problems around human health and climate change.