# Past Events

## Lecture: Tamir Bendory

**Tuesday, Feb. 7, 2023, 1:25 p.m.** through **Tuesday, Feb. 7, 2023, 2:25 p.m.**

Zoom only

Data Science Seminar

Tamir Bendory (Tel Aviv University)

Registration is required to access the Zoom webinar.

**Title:** Multi-reference alignment: Representation theory perspective, sparsity, and projection-based algorithms

**Abstract:** Multi-reference alignment (MRA) is the problem of recovering a signal from its multiple noisy copies, each acted upon by a random group element. MRA is mainly motivated by single-particle cryo-electron microscopy (cryo-EM): a leading technology to reconstruct biological molecular structures. In this talk, I will analyze the second moment of the MRA and cryo-EM models. First, I will show that in both models the second moment determines the signal up to a set of unitary matrices, whose dimension is governed by the decomposition of the space of signals into irreducible representations of the group. Second, I will present sparsity conditions under which a signal can be recovered from the second moment, implying that the sample complexity is proportional to the square of the variance of the noise. If time permits, I will introduce a new computational framework for cryo-EM that combines a sparse representation of the molecule with projection-based techniques used for phase retrieval in X-ray crystallography.
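As a rough illustration of why the second moment loses information, the following minimal sketch specializes MRA to cyclic shifts of a 1-D signal (an assumption; the talk covers general groups). In that case the shift-invariant part of the second moment is the power spectrum, which fixes each Fourier magnitude but not its phase. The function names, the Gaussian noise model, and the bias correction `n * sigma**2` are illustrative choices, not the speaker's code.

```python
import numpy as np

def simulate_mra(x, num_copies, sigma, rng):
    """Return noisy copies of x, each cyclically shifted by a uniform random amount."""
    shifts = rng.integers(0, x.size, size=num_copies)
    shifted = np.stack([np.roll(x, s) for s in shifts])
    return shifted + sigma * rng.standard_normal(shifted.shape)

def power_spectrum(x):
    """Squared Fourier magnitudes; unchanged by any cyclic shift of x."""
    return np.abs(np.fft.fft(x)) ** 2

def estimate_power_spectrum(observations, sigma):
    """Average the per-copy power spectra and subtract the white-noise bias n * sigma^2."""
    n = observations.shape[1]
    spectra = np.abs(np.fft.fft(observations, axis=1)) ** 2
    return spectra.mean(axis=0) - n * sigma**2

rng = np.random.default_rng(0)
x = rng.standard_normal(8)                       # hypothetical ground-truth signal
copies = simulate_mra(x, num_copies=20000, sigma=0.5, rng=rng)

print(np.round(power_spectrum(x), 2))                        # true power spectrum
print(np.round(estimate_power_spectrum(copies, 0.5), 2))     # estimate from noisy copies
```

The estimate never needs the unknown shifts, which is what makes moment-based approaches attractive at high noise levels; recovering the missing phases requires extra structure such as the sparsity conditions mentioned in the abstract.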

## Identifying and achieving career goals

**Friday, Feb. 3, 2023, 1:25 p.m.** through **Friday, Feb. 3, 2023, 2:25 p.m.**

Walter Library 402

Industrial Problems Seminar

Brittany Baker (The Hartford)

Registration is required to access the Zoom webinar.

### Abstract

## Lecture: Nadav Dym

**Tuesday, Jan. 31, 2023, 1:25 p.m.** through **Tuesday, Jan. 31, 2023, 2:25 p.m.**

Zoom

Data Science Seminar

Nadav Dym (Technion-Israel Institute of Technology)

Registration is required to access the Zoom webinar.

**Title:**

**Abstract:**

A common theoretical requirement of an equivariant architecture is that it be universal, meaning that it can approximate any continuous equivariant function. This question typically boils down to another theoretical question: assume that we have a group G acting on a set V; can we find a mapping f:V→R^m such that f is G-invariant and, on the other hand, f separates any two points in V that are not related by a G-symmetry? Such a mapping is essentially an injective embedding of the quotient space V/G into R^m, which can then be used to prove universality. We will review results showing that under very general assumptions such a mapping f exists, and the embedding dimension m can be taken to be 2dim(V)+1. We will show that in some cases (e.g., graphs) computing such an f can be very expensive, and will discuss our methodology for efficient computation of such f in other cases (e.g., sets). This methodology is a generalization of the algebraic geometry argument used for the well-known proof of phase retrieval injectivity.

Based on work with Steven J. Gortler
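To make the notion of an invariant, separating map concrete, here is a minimal sketch for the simplest set-like case: G is the permutation group acting on vectors in R^n by reordering entries, and f sorts the entries. Sorting is used only as a familiar example and is not claimed to be the construction from the talk; the function name is hypothetical.

```python
import numpy as np

def sort_embedding(v):
    """Permutation-invariant map R^n -> R^n: sort the entries in ascending order."""
    return np.sort(np.asarray(v, dtype=float))

rng = np.random.default_rng(0)
v = rng.standard_normal(5)
w = rng.permutation(v)      # same multiset of entries, different order
u = v.copy()
u[0] += 1.0                 # a different multiset (the sum of entries changes)

# Invariance: points in the same orbit get the same image.
print(np.array_equal(sort_embedding(v), sort_embedding(w)))
# Separation: points in different orbits get different images.
print(np.array_equal(sort_embedding(v), sort_embedding(u)))
```

Here the embedding dimension is m = n, which sits below the general bound 2dim(V)+1 quoted in the abstract.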

## An Overview of Open Problems in Autonomous Systems

**Friday, Jan. 27, 2023, 1:25 p.m.** through **Friday, Jan. 27, 2023, 2:25 p.m.**

Zoom

Industrial Problems Seminar

Natalia Alexandrov (NASA Langley Research Center)

Registration is required to access the Zoom webinar.

### Abstract

Although humans will continue as active system participants, increasing system complexity will demand a growing degree of machine autonomy. We can make good use of the developments in autonomous cars. However, the airspace environment is much less forgiving and presents special problems. It’s safety critical, time critical, and depends on certification. The question is: when can we trust an autonomous system in such environments? In this talk, I will give examples of open problems in the design and operation of autonomous systems and suggest where mathematical attention would be in order.

## Lecture: March Boedihardjo

**Tuesday, Jan. 24, 2023, 1:25 p.m.** through **Tuesday, Jan. 24, 2023, 2:25 p.m.**

Zoom only

Data Science Seminar

March Boedihardjo (ETH Zürich)

Registration is required to access the Zoom webinar.

**Title**: Spectral norm of random matrices

**Abstract**: Tropp's matrix concentration inequalities give sharp estimates for the spectral norm of many random matrices that arise in applications. However, even in the case when all the entries are i.i.d. standard Gaussian, the estimates are only sharp up to a log-dimension factor but not even sharp up to a constant. I will present an estimate for sums of independent random matrices that is sharp in many cases including the case when all the entries are i.i.d. standard Gaussian. Joint work with Afonso Bandeira and Ramon van Handel.
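The gap described in the abstract can be seen numerically. The sketch below draws an n × n matrix with i.i.d. standard Gaussian entries and compares its spectral norm with 2√n, its known leading-order size, and with the bound √(2 v log(2n)) from the matrix Gaussian series inequality, where the variance parameter is v = n for this model. The matrix size and seed are arbitrary choices, and this is not the estimate presented in the talk.

```python
import numpy as np

def gaussian_spectral_norm(n, rng):
    """Spectral norm (largest singular value) of an n x n i.i.d. N(0, 1) matrix."""
    return np.linalg.norm(rng.standard_normal((n, n)), ord=2)

def matrix_concentration_bound(n):
    """sqrt(2 * v * log(d1 + d2)) with variance parameter v = n and d1 = d2 = n."""
    return np.sqrt(2.0 * n * np.log(2.0 * n))

n = 400
rng = np.random.default_rng(0)
norm = gaussian_spectral_norm(n, rng)

print(f"spectral norm       : {norm:.2f}")
print(f"2 * sqrt(n)         : {2 * np.sqrt(n):.2f}")
print(f"concentration bound : {matrix_concentration_bound(n):.2f}")
```

The bound exceeds the observed norm by a factor that grows like the square root of log n, which is the log-dimension looseness the talk aims to remove.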

## Lecture: Luke Jacobsen and Jeff Lande

**Friday, Jan. 20, 2023, 1:25 p.m.** through **Friday, Jan. 20, 2023, 2:25 p.m.**

Walter Library 402 and Zoom (registration required)

Industrial Problems Seminar

Luke Jacobsen (Medtronic), Jeff Lande (Medtronic)

**Title:** Quantitative Careers in the Medical Device Industry

**Abstract:** We will give an overview of quantitative careers in the medical device industry, focusing on the role of the biostatistician in the Cardiac Rhythm Management (CRM) space. We will describe some CRM products and provide examples of work within clinical studies to demonstrate the safety and efficacy of these products including the use of alternative data sources to help address relevant clinical questions.

## Lecture: Mauro Maggioni

**Tuesday, Jan. 17, 2023, 1:25 p.m.** through **Tuesday, Jan. 17, 2023, 2:25 p.m.**

Walter Library 402 or Zoom

Data Science Seminar

Mauro Maggioni (Johns Hopkins University)

You may attend the talk either in person in Walter 402 or register via Zoom. Registration is required to access the Zoom webinar.

**Title:** Two estimation problems for dynamical systems: linear systems on graphs, and interacting particle systems

**Abstract:** We are interested in problems where certain key parameters of a dynamical system need to be estimated from observations of trajectories of the system. In this talk I will discuss two problems of this type.

The first one is the following: suppose we have a linear dynamical system on a graph, represented by a matrix A. For example, A may be a random walk on the graph. Suppose we observe some entries of A, some entries of A^2, …, some entries of A^T, for some time T, and wish to estimate A. We are interested in the regime when the number of entries observed at each time is small relative to the total number of entries of A. When T=1 and A is low-rank, this is a matrix completion problem. When T>1, the problem is interesting also in the case when A is not low-rank, as one may hope that sampling at multiple times can compensate for the small number of entries observed at each time. We develop conditions that ensure that this estimation problem is well-posed, introduce a procedure for estimating A by reducing the problem to the matrix completion of a low-rank structured block-Hankel matrix, obtain results that capture at least some of the trade-offs between sampling in space and time, and finally show that this estimator can be constructed by a fast algorithm that provably locally converges quadratically to A. We verify this numerically on a variety of examples. This is joint work with C. Kuemmerle and S. Tang.
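One way to see why powers of A lead to a low-rank block-Hankel structure: if block (i, j) holds A^(i+j+1), the whole matrix factors through A, so its rank is at most n even though its size is mn × mn. The sketch below checks this numerically. The block indexing, the example graph, and the function names are assumptions for illustration; the construction used in the talk may differ.

```python
import numpy as np

def random_walk_matrix(weights):
    """Row-normalize a nonnegative weight matrix into a random-walk matrix."""
    return weights / weights.sum(axis=1, keepdims=True)

def block_hankel(A, m):
    """Block-Hankel matrix whose (i, j) block is A^(i + j + 1), for 0 <= i, j < m."""
    powers = [np.linalg.matrix_power(A, k) for k in range(1, 2 * m)]
    return np.block([[powers[i + j] for j in range(m)] for i in range(m)])

rng = np.random.default_rng(0)
n, m = 5, 3
W = rng.random((n, n))
W = W + W.T                       # symmetric weights of a hypothetical graph
A = random_walk_matrix(W)
H = block_hankel(A, m)            # uses the powers A^1, ..., A^(2m - 1)

print(H.shape)                                # (m * n, m * n)
print(np.linalg.matrix_rank(H, tol=1e-10))    # at most n
```

The rank bound follows because H equals the product of the stacked powers [I; A; …; A^(m-1)], the matrix A, and the row of powers [I, A, …, A^(m-1)].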

The second problem is when the dynamical system is nonlinear, and models a set of interacting agents. These systems are ubiquitous in science, from modeling of particles in Physics to prey-predator and colony models in Biology, to opinion dynamics in social sciences. Oftentimes the laws of interactions between the agents are quite simple, for example they depend only on pairwise interactions, and only on pairwise distance in each interaction. We consider the following inference problem for a system of interacting particles or agents: given only observed trajectories of the agents in the system, can we learn what the laws of interactions are? We would like to do this without assuming any particular form for the interaction laws, i.e. they might be “any” function of pairwise distances. We discuss when this problem is well-posed, we construct estimators for the interaction kernels with provably good statistical and computational properties, and discuss extensions to second-order systems, more general interaction kernels, and stochastic systems. We measure empirically the performance of our techniques on various examples that include extensions to agent systems with different types of agents, second-order systems, families of systems with parametric interaction kernels, and settings where the interaction kernels depend on unknown variables. We also conduct numerical experiments to test the large-time behavior of these systems, especially in the cases where they exhibit emergent behavior. This is joint work with F. Lu, J. Feng, P. Martin, J. Miller, S. Tang and M. Zhong.
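A minimal sketch of this inference problem, under strong simplifying assumptions that are not taken from the talk: a first-order system in which agent i moves with velocity (1/N) Σ_j φ(|x_j − x_i|)(x_j − x_i), exact velocity observations at independent random configurations instead of full trajectories, and a kernel φ that is piecewise constant on known distance bins. Under these assumptions the velocities are linear in the unknown bin values, so ordinary least squares recovers them.

```python
import numpy as np

def pairwise(x):
    """Return displacements diff[i, j] = x_j - x_i and their Euclidean norms."""
    diff = x[None, :, :] - x[:, None, :]
    return diff, np.linalg.norm(diff, axis=-1)

def velocities(x, phi):
    """Velocity of each agent: (1/N) * sum_j phi(|x_j - x_i|) * (x_j - x_i)."""
    diff, dist = pairwise(x)
    return (phi(dist)[:, :, None] * diff).sum(axis=1) / x.shape[0]

def design_matrix(x, edges):
    """Column k holds the velocities generated by the indicator function of bin k."""
    diff, dist = pairwise(x)
    cols = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (dist >= lo) & (dist < hi)
        cols.append(((mask[:, :, None] * diff).sum(axis=1) / x.shape[0]).ravel())
    return np.column_stack(cols)

def fit_kernel(snapshots, observed, edges):
    """Least-squares estimate of the kernel value on each distance bin."""
    A = np.vstack([design_matrix(x, edges) for x in snapshots])
    b = np.concatenate([v.ravel() for v in observed])
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef

edges = np.array([0.0, 0.5, 1.0, 1.5])      # bins cover every distance in the unit square
true_values = np.array([1.0, -0.5, 0.2])    # hypothetical kernel value on each bin

def phi_true(r):
    """Piecewise-constant ground-truth kernel used to generate the data."""
    idx = np.clip(np.digitize(r, edges) - 1, 0, len(true_values) - 1)
    return true_values[idx]

rng = np.random.default_rng(0)
snapshots = [rng.random((10, 2)) for _ in range(50)]     # 10 agents in the unit square
observed = [velocities(x, phi_true) for x in snapshots]

print(np.round(fit_kernel(snapshots, observed, edges), 6))
```

With noisy data, velocities approximated from trajectories, or a kernel outside the chosen basis, the same regression becomes a genuine nonparametric estimation problem, which is the setting the abstract addresses.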

## Optimal shrinkage of singular values under noise with separable covariance & its application to fetal ECG analysis

**Tuesday, Dec. 13, 2022, 1:25 p.m.** through **Tuesday, Dec. 13, 2022, 2:25 p.m.**

Walter Library 402 or Zoom

Data Science Seminar

Pei-Chun Su (Duke University)

You may attend the talk either in person in Walter 402 or register via Zoom. Registration is required to access the Zoom webinar.

### Abstract

High-dimensional noisy datasets are commonly encountered in many scientific fields, and a critical step in data analysis is denoising. Under the white noise assumption, optimal shrinkage has been well-developed and widely applied to many problems. However, in practice, noise is usually colored and dependent, and the algorithm needs modification. We introduce a novel fully data-driven optimal shrinkage algorithm when the noise satisfies the separable covariance structure. The novelty involves a precise rank estimation and an accurate imputation strategy. In addition to providing theoretical support under the random matrix framework, we show the performance of our algorithm in simulated datasets and apply the algorithm to extract fetal electrocardiogram from the benchmark trans-abdominal maternal electrocardiogram, which is a special single-channel blind source separation challenge.
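For context, the white-noise baseline mentioned in the abstract can be sketched in a few lines. The example assumes a square n × n matrix, i.i.d. Gaussian noise with known standard deviation 1/√n, and the Frobenius-loss shrinker η(y) = √(y² − 4) for y > 2 (zero otherwise). It is not the separable-covariance algorithm from the talk, and the function names are hypothetical.

```python
import numpy as np

def shrink_frobenius(singular_values):
    """White-noise shrinker for square matrices with noise level 1/sqrt(n)."""
    return np.sqrt(np.maximum(singular_values**2 - 4.0, 0.0))

def denoise(Y):
    """Shrink the singular values of Y and rebuild the matrix."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U * shrink_frobenius(s)) @ Vt

rng = np.random.default_rng(0)
n = 100
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
v = rng.standard_normal(n)
v /= np.linalg.norm(v)
X = 3.0 * np.outer(u, v)                           # rank-one signal, singular value 3
Y = X + rng.standard_normal((n, n)) / np.sqrt(n)   # white noise with std 1/sqrt(n)

print("Frobenius error of noisy matrix :", np.linalg.norm(Y - X))
print("Frobenius error after shrinkage :", np.linalg.norm(denoise(Y) - X))
```

When the noise is colored, the singular values of pure noise no longer concentrate below 2, so this fixed shrinker is no longer appropriate; that is the situation the talk's data-driven algorithm targets.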

## Data Science to Software Engineering and Back Again

**Friday, Dec. 9, 2022, 1:25 p.m.** through **Friday, Dec. 9, 2022, 2:25 p.m.**

Walter Library 402

Industrial Problems Seminar

Cora Brown (Bridge Financial Technology)

You may attend the talk either in person in Walter 402 or register via Zoom. Registration is required to access the Zoom webinar.

### Abstract

In this talk I will discuss my early career as a Data Scientist and Software Engineer. The skills necessary for these two types of roles overlap and complement each other. Drawing on my experiences in both fields, I will share some of the skills I’ve found valuable in each position and why I’ve chosen to follow this path. I will focus on the ways in which developing solid software skills has made me a better Data Scientist. Finally, I will describe some of the specific problems I’ve worked on as a Data Scientist and Software Engineer and how a background in mathematics can aid in solving these problems.

## Equivariant machine learning

**Tuesday, Dec. 6, 2022, 1:25 p.m.** through **Tuesday, Dec. 6, 2022, 2:25 p.m.**

Walter 402 and virtually by Zoom (Zoom registration required)

Data Science Seminar

Soledad Villar (Johns Hopkins University)

### Abstract

In this talk we will give an overview of the enormous progress in the last few years, by several research groups, in designing machine learning methods that respect the fundamental symmetries and coordinate freedoms of physical law. Some of these frameworks make use of irreducible representations, some make use of high-order tensor objects, and some apply symmetry-enforcing constraints. Different physical laws obey different combinations of fundamental symmetries, but a large fraction (possibly all) of classical physics is equivariant to translation, rotation, reflection (parity), boost (relativity), units scalings, and permutations. We show that it is simple to parameterize universally approximating polynomial functions that are equivariant under these symmetries, or under the Euclidean, Lorentz, and Poincare groups, at any dimensionality d. The key observation is that nonlinear O(d)-equivariant (and related-group-equivariant) functions can be universally expressed in terms of a lightweight collection of (dimensionless) scalars -- scalar products and scalar contractions of the scalar, vector, and tensor inputs. We complement our theory with numerical examples that show that the scalar-based method is simple, efficient, and scalable, and mention ongoing work on cosmology simulations.
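The key observation can be checked numerically. The sketch below builds a vector-valued function of the form f(v_1, …, v_n) = Σ_i g_i v_i, where each coefficient g_i depends only on the matrix of pairwise scalar products, and verifies that f(Qv) = Q f(v) for a random orthogonal Q. The particular coefficient function is an arbitrary stand-in for a learned model, not the authors' architecture.

```python
import numpy as np

def coefficients(gram):
    """Arbitrary nonlinear function of the invariant scalars, one output per input vector."""
    return np.tanh(gram).sum(axis=1) + np.trace(gram)

def equivariant_map(vectors):
    """f(v_1..v_n) = sum_i g_i(Gram matrix) * v_i, with the input vectors stored as rows."""
    gram = vectors @ vectors.T      # scalar products <v_j, v_k>, unchanged by O(d)
    return coefficients(gram) @ vectors

rng = np.random.default_rng(0)
n, d = 4, 3
V = rng.standard_normal((n, d))
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))   # random orthogonal matrix

lhs = equivariant_map(V @ Q.T)   # transform the inputs, then apply f
rhs = Q @ equivariant_map(V)     # apply f, then transform the output
print(np.allclose(lhs, rhs))
```

Equivariance holds for any choice of `coefficients`, because the Gram matrix is unchanged when every input is multiplied by the same orthogonal matrix.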