Mind the data gap: Using machine learning to further global ecology

CS&E professor part of new U of M Biology Integration Institute

As machine learning and artificial intelligence (AI) surge in popularity, tech giants like Amazon and Google are preoccupied with AI assistants, self-driving cars, and data-mining algorithms. Computer Science & Engineering Professor Arindam Banerjee has something different in mind: climate science and ecology.

Banerjee applies machine learning strategies to different facets of science like biology, climate change, and terrestrial ecosystem modeling. He’s working with various researchers across the University of Minnesota and the University of Wisconsin-Madison as a part of the new Biology Integration Institute (BII), for which the U of M received a $12.5 million National Science Foundation grant.

Led by College of Biological Sciences (CBS) professor Jeannine Cavender-Bares, the researchers at BII aim to use spectral biology, which relies on data-gathering tools such as satellite imagery and drones, to better study biodiversity.

“This is still like the old biology that we read in our textbooks, but now we want to actually be able to measure it globally in large communities and in forests,” Banerjee said.

“The measurements aren’t always going to be coming from under a microscope, they are going to be coming from the sky,” he said.

Finishing the map

Banerjee’s interest in biology began almost a decade ago, when College of Food, Agricultural and Natural Resource Sciences (CFANS) professor Peter Reich invited him to a research talk on the Twin Cities campus in St. Paul. The ecology researchers faced a problem in their field: a lack of data. They wanted to study nitrogen and phosphorus levels in plants globally, but they had only 10 percent of the data they needed to do so. They wondered if machine learning could help fill the gap.

Since then, Banerjee and his lab have been collaborating with Reich’s research group, developing algorithms that use the available plant trait information to create global maps of nitrogen and phosphorus levels in leaves.

This knowledge can ultimately help scientists better determine how much carbon is in Earth’s atmosphere—without traversing every inch of the globe to gather the physical data.

Enter 2020. Banerjee was recruited to work with BII on a similar project: using machine learning algorithms to identify plant species with hyperspectral data from satellites, aircraft, and drones.

“Ideally, you want to study life sciences using genomics, looking at the molecules, biochemical processes, and so on,” Banerjee said. “But, that is very hard to measure on a large, global scale. What is easy to measure is data from satellites and from drones.”

In with the old, out with the new?

The purpose of BII is to bridge these hyperspectral measurements with traditional biology. Banerjee is working with a team at the University of Zurich in Switzerland that has collected samples of plants from the forests of several European countries.

The Zurich researchers have both the physical data from studying the specimens under a microscope and the corresponding satellite and drone images for the same plants. Banerjee’s team can then use these measurements to train their machine learning algorithm. Once the algorithm has been fed an input (satellite imagery) with a known output (plant species), it can learn to classify species based on other hyperspectral data from across the world.
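The idea of learning from labeled examples can be shown with a toy sketch. This is not the BII team's algorithm: it is a simple nearest-centroid classifier chosen for illustration, and the reflectance values, the four spectral bands, and the species names ("oak", "maple") are all made up.

```python
# Illustrative sketch only: synthetic numbers and hypothetical species labels.
from math import dist
from statistics import mean


def fit_centroids(spectra, labels):
    """Average the training spectra of each species into one centroid."""
    grouped = {}
    for spectrum, label in zip(spectra, labels):
        grouped.setdefault(label, []).append(spectrum)
    return {
        label: [mean(band) for band in zip(*rows)]
        for label, rows in grouped.items()
    }


def classify(centroids, spectrum):
    """Return the species whose centroid is closest to the given spectrum."""
    return min(centroids, key=lambda label: dist(centroids[label], spectrum))


# Known inputs (made-up reflectance in four bands) with known outputs (species).
train_spectra = [
    [0.05, 0.08, 0.45, 0.30],
    [0.06, 0.09, 0.47, 0.28],
    [0.04, 0.12, 0.60, 0.40],
    [0.05, 0.11, 0.58, 0.42],
]
train_labels = ["oak", "oak", "maple", "maple"]

centroids = fit_centroids(train_spectra, train_labels)

# A new, unlabeled spectrum is assigned to the nearest learned species.
predicted = classify(centroids, [0.05, 0.10, 0.59, 0.41])
print(predicted)
```

Real hyperspectral data has hundreds of bands and far more species, so a production model would likely be much more sophisticated, but the train-then-classify pattern is the same.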

“We want to understand plant biodiversity at multiple scales in a rapidly changing world,” Banerjee explained.

“The second aspect is to understand how global changes are affecting biodiversity,” he said. “Is it going to be more resilient to global changes, or are places where we only have a few species going to be more harmed in the process?”

One of the ultimate goals of the project, Banerjee said, is to incorporate plant biodiversity into land surface models, which scientists use to estimate how much carbon the land pulls from the atmosphere.

“[Spectral biology] is relatively new in the field—it’s only over the last 5-10 years that this is starting to look feasible,” he said. “This will probably be one of the first times that a fully devoted team will dig in, and this will potentially transform biology and the way terrestrial biology and ecosystem modeling are done.”

Cavender-Bares (CBS) and BII co-directors Peter Reich (CFANS) and Philip Townsend (University of Wisconsin-Madison) have worked hard to spearhead this new research perspective in biology. Banerjee’s project is just one facet of BII, which is made up of over a dozen researchers plus additional teams focusing on community engagement and outreach.

Learn more about these collaborative projects on the Biology Integration Institute website.

Story by Olivia Hultgren