The beauty of the numbers

Sara Algeri is an Assistant Professor of Statistics and a member of the data science graduate faculty. This interview is part of a series that highlights faculty research at the University of Minnesota Twin Cities College of Liberal Arts. Watch the full interview with Sara Algeri here.

How did you become interested in statistics originally?

Basically by chance. In Italy, first of all, you have to select your major already when you're still in high school before even entering the university. And so I decided to go for statistics, because during an orientation program, I got to know that statistics existed, because it's not really known as a discipline, even in Italy. And so one man who actually works in the National Institute for Statistics in Italy was talking in my school, and I got to know it exists.

I got to know it is a field of applied math, and I liked math. But I wasn't really sure of what I really wanted to do. I mean, I knew I liked mathematics, [but] I don't like the theory behind it. I like the applied side. I like physics, because in some sense it can be seen as an application of math. It has some practical utility. And so I didn't know if I wanted to go into engineering. I was also good at art, so I was considering architecture as well. I was interested in the medical sciences. I had no idea what to do. And statistics was the only one that somehow allowed me to keep all my doors open, in the sense that you can really apply statistics to any area.

I knew that there are applications in the environmental sciences, in medical sciences, in the physical sciences, in economics also. And so that's why I decided to go for statistics. Basically, to have the freedom of choosing the area of application later on and not at 17 years old.

Can you talk about the project you’re working on to help detect dark matter? How do you find something that we’re not even sure exists?

First of all, it actually goes a long way back. I was doing a master's in statistics at Texas A&M, and I attended a talk on the field of astrostatistics, which I didn't even know existed before.

I always had some interest in physics anyway. And already when I was in Italy I looked for applications in physics, but nobody really knew anything about it. Nobody could help me to address this interest. And when I got to know that this field existed (it's essentially statistics applied to physics or astronomy), I looked around for graduate programs in the area.

I managed to get a PhD fellowship at Imperial College in London. And the title of this fellowship was "statistical issues in the search for dark matter." I didn't even know myself what dark matter was at that point, and I don't really have a formal background in physics, so my understanding of the area itself was very limited.

But all I really needed to know was: What were we trying to find? What were the statistical issues arising? If you want to formulate this physics problem as a statistical problem, what do we have to do, and what are the statistical tools that we need?

And so my job was just to work on these statistical tools and develop some methodologies that can be of help in detecting dark matter. But I really didn't have to study the details of the physics, thankfully. Otherwise, I probably wouldn't have gotten a PhD.

What makes your work unique in the field of astrostatistics?

My area itself. There are not many people who work in this area at the moment. And many of the people I know who work in this area mostly work with astronomers rather than particle physicists. It might be because my advisor was a particle physicist that I'm more interested in this area. I mean, in some sense, I was kind of raised academically in this area. So I have a stronger interest in particle physics, which is slightly different from many people in the astrostatistics area.

So now, what I'm trying to do is to develop methods that can address problems in this area, but at the same time maintain the universal character of statistics, really. Meaning I'm motivated by problems in particle physics. I want to develop tools for signal detection, the detection of new particles.

But then, I want the same tool to be able to help, say, biologists to find genes that might be related to certain diseases, or neuroscientists to find activity in the brain of people with certain problems, diseases, or mental issues.

I am really trying to make methods that are general, but then they may find immediate application in potentially groundbreaking scientific problems. I mean, if they really find dark matter, this is a huge deal.

So far nobody's found anything, but the physics community believes that within the next few years, probably fewer than 10, they will be able to find it, which is reasonable because the experiments are becoming more and more powerful. And so it's really possible. But yeah, so far nobody has ever detected it, and I don't know if they've used my tools or not. I have developed them and published them.

Do you have your own theory of what dark matter would look like?

Well, it's highly conditional on the physics theory, and most importantly on what people in the physics group with whom I collaborated think. The most common hypothesis is that it is made out of WIMPs, which means weakly interacting massive particles. And so the idea of weakly interacting is that they don't interact with light the way normal particles, normal matter, do. They are massive in the sense that they have a lot of mass.

And part of the reason why we call it dark matter is, again, that these particles do not interact with light. So that's what makes them dark. But again, nobody's ever detected WIMPs. So that is an open area of research, still.

What do you hope that your students are going to get from your courses and from your teaching?

Well, what I always tell them is that I want them to develop a way of thinking in statistical terms such that they will not be replaceable by a robot in 20 years. Because when it comes to computational sciences, statistics is not really a computational science, but it has its computational sides. If somebody wanted to strictly apply statistical tools without really understanding them, it's possible.

So I want them to understand what they are doing and why they're doing it and why what they're doing works, rather than just, okay, I've applied it, this is what I've got, I'm happy with it. Because they have to do some reasoning that I didn't have to do when I was their age, or at this stage, which is: I have to be able to develop some skills that will not be replaced by automation, by artificial intelligence or whatever. And so they have to learn to think and they have to, hopefully, learn to be innovators in some sense. But this is a much bigger challenge.

What do you wish people knew about your research?

I think there is a process that I should explain to properly answer the question. So during my PhD, I was doing some research on inferential methods, meaning statistical methods that can tell you, okay, this difference is significant. To say that a difference is significant, you must use certain statistical tools, and so forth. So that's what inference is, essentially.

The problem is that many of these methods were relying on complex mathematical constructs. So as soon as I started talking about these constructs, to either the physics community or even just the statistics community, to people who are not experts in that specific area of statistical theory, it felt like I was not creating an interest. Like, people were not really interested simply because it was kind of complicated to even just understand the framework. And so who would ever even apply something like this to their field?

From that, I've learned that a solution can be as general as you want, as beautiful as you want, but if people cannot understand it, they will not use it. So the current area of research [that] I'm working on involves methods that are simple, fundamentally, in the sense that the theory behind them, if somebody studies it properly, is really simple, but at the same time really powerful. So that's the statistical framework I'm using. And that's the statistical framework where I'm developing my tools.

Essentially, I would like people to know that it's really nothing difficult. Even if the equations may sound scary, sometimes the foundation is really simple. And that's what makes it beautiful.

This story was originally published by the School of Statistics.