Data Driven

Data is accumulating around the world. By 2025, according to one recent study, each connected person on the planet will have at least one data interaction, from a Facebook like to a Google search, every 18 seconds. The opportunities for corporations, healthcare providers, educational institutions, governments, and other organizations to benefit from this information are enormous, if they can harness it.

Data analytics is increasingly central to the research and teaching of University of Minnesota ISyE faculty. Their interest is driven by their own curiosity, but also fueled by questions arising in industry and business. Opportunities for ISyE students to study data analytics in their coursework continue to expand, and employers are eager to hire graduates with skills in this area. "It's been a hot area and it's not cooling off," says ISyE professor Bill Cooper. "I only see the demand going up."

This article is the first in a series focused on the wide-ranging work of ISyE department faculty in the area of data analytics. 



In some ways, data analytics is merely a contemporary form of traditional operations research, where quantitative methods are used to guide decision-making within an organization. "What's changed over the past 10 to 20 years is the scale at which these methods can be deployed," says ISyE professor Bill Cooper. "There's so much more data available, and there's significantly more computing power to take advantage of this data."

Cooper has a longstanding interest in models that help organizations with revenue management, and several of his research projects have centered around pricing in the airline industry. "Airlines were among the leaders in analyzing data, because they collected a lot of it and they had the ability to dynamically change their prices," Cooper explains. "Now, all sorts of industries have information about previous buying histories, customer demand, competitor pricing, and more."


That data, some of it provided by customers themselves, helps airlines, retailers, and others predict demand and set prices accordingly. Such projections may lead an airline, for example, to change ticket prices on a certain flight, maximizing the profit potential. 

Computing power has also grown by leaps and bounds in recent years, allowing data scientists to take advantage of more data than ever before. "The scale of problems you can address where the answer is not a formula has gotten bigger," Cooper says. "Of course, the more things you put into a model, the more you have to estimate–so 'more complicated' is not always better. But in the past, even if you had a nice model for multiple products, you had no hope of doing computation. I suppose that there will always be data sets that are too large for certain types of analyses. But as computing power increases, it is possible to do more and more."



Data analytics is also changing the face of healthcare and medicine. For example, ISyE Professor Kevin Leder recently began a collaboration with researchers at the University of Oslo to determine the best course of treatment for patients with multiple myeloma. Often, after a period of use, a drug will no longer work effectively for a patient, so providers must decide which drug to use next.

"There are a lot of drug choices," Leder says. "After the patient has gone through ten drugs, it's kind of a guess what drug should be used for the best outcome. Our goal is to create algorithms that use multiple myeloma patient history and in vitro experiments to determine a recommended course of therapy for the patient."

Data analytics has improved greatly with the advent of machine learning, Leder says. Data sets have gotten better and computing power has increased. "We're learning the answers to questions that we couldn't answer all that well before," he says. "Eventually, we hope to get to the point in medicine where we can say, yes this is the right treatment, based on data."

But the effectiveness of data analytics, of course, is based on its design. One challenge in determining the best course of treatment is the need to predict outliers. "Tumors are generally very complex and heterogeneous entities, so we have a problem where sequencing the tumor doesn't tell us the whole picture," Leder says. "Imagine a tumor has 100 billion cells and you sequence the tumor to know what drug to take. Your sequencing says take drug A, but there is a subpopulation in the tumor of size 100 million cells (0.1% of cells) that is resistant to drug A. It is hard to imagine a way we can figure out how to identify the presence of that 'tiny' subpopulation resistant to drug A. I think we'll figure it out eventually, but right now it seems hard."



Increasingly, the data available to researchers is so overwhelming that it's hard to know what to look for at first. Fortunately, machine learning provides an opportunity to search for the proverbial "needle in the haystack."

ISyE Professor Shuzhong Zhang is among those who are hoping to discern patterns in medical data that will allow medical providers or physicians to predict conditions like lung cancer and Alzheimer's in their early stages—when intervention can significantly alter the outcome for the patient. "It's easy to gather the data, but difficult to figure out how to meaningfully compare indicators from different sources," says Zhang, whose specialty is the design and analysis of optimization algorithms.


Zhang is collaborating with researchers at Arkansas State University to discern patterns in DNA sequences and CT images that, taken together, can alert doctors to the likelihood of Alzheimer's. "CT scans are very large, which can be a challenge," Zhang says. "The number of variables can be in the millions, and there may be a lot of noise in the data."

DNA sequences are more straightforward, but figuring out how to use the data jointly is the ultimate—and as yet elusive—point of the project. "If we can unlock the connection, we can reinforce confidence in our predictions of early stage Alzheimer's," Zhang says. "Data is the way to help us out of this problem."

As our ability to use markers to predict disease improves, data analytics may lead us to individualized medicine, Zhang says. "Perhaps someday we'll be designing specific cancer treatments for individuals," he says.

Joel Hoekstra is a Minneapolis-based writer.