Ben Crary monitors natural surface waters using big data

Ben Crary (BCE 2011; MS Environmental Engineering 2013, UW–Madison) relishes opportunities to get outside and see the natural surface water sites he works on as a Senior Principal Engineer at Hazen and Sawyer (Hazen). Hazen is a consulting firm that addresses “All Things Water®.” It supports the people who are charged with protecting our waters.


Alumnus Ben Crary at Hazen and Sawyer

The St. Paul office of Hazen, just five minutes from the University of Minnesota, houses a lot of people who work on water treatment and design. Within the office, Crary is an outlier. His focus is on natural surface waters, earning him the nickname “the unprocessed water guy.” He is part of Hazen’s Water Resources Group and the Source Water Protection Subject Matter Expert Group. He usually works with clients who are discharging something into the environment, ensuring that those discharges meet regulations and don’t harm receiving waters. He also works with clients who want to protect the reservoirs and rivers that serve as their drinking water sources.

Value of AI / Big Data

Crary enjoys touring a watershed and seeing how it all comes together. “To be there and appreciate the natural environment, to float down a river in a kayak, and take the measurements yourself. I take those opportunities whenever I can. I’d love to be out there more. My personal perspective is that we need to protect our water resources. Minnesota is a region with a lot of water, a lot of good water. Not all regions are as fortunate. My plea is to keep our water clean and protect it while we can.” 

From a water resources perspective, Crary sees data as invaluable. The industry collects data to understand the limits or boundaries of a water system in an effort to protect our natural resources. More and more data is being collected every day, allowing researchers to identify new solutions and new problems. 

“I think of data science as statistics, data visualization, and coding. Each one of those skill sets requires a lot of practice and thought. Using them together, you can start to do some really great things. People who have the data savvy to understand, integrate, and leverage multiple data sets will be well positioned to be industry leaders.” 

Data Collection Tools 

Crary explains some of the data collection tools he uses. “When I say data, I’m thinking about water quantity, how much water is flowing through a river, and water quality, does it meet water quality standards? Will it serve as a sustainable water resource for humans and aquatic life?

“There are many, many parameters that we look at, and a lot of these constituents can be measured in real time, like flow, dissolved oxygen, and temperature. A lot of gauges and sensors are deployed. We, the royal we, can access that information in real time to tell us what’s going on in a particular region.

“On the water quality side of things, when we are looking at nutrients and other drivers, we often need to collect samples. We’ll send out teams to collect water in a jar and send it to a lab.

“There’s also remote sensing, which is an emerging area, both with drones and with satellite imagery. With those, we can start to look at things like land use surrounding water bodies, and observe short-term and long-term changes. We are getting much closer to being able to use remote sensing for algal detection on a very small scale.”
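One common index-based screen of the kind Crary alludes to is a band-ratio chlorophyll index. The sketch below is a generic illustration, not a Hazen workflow: it computes the Normalized Difference Chlorophyll Index from two small, made-up reflectance tiles, with band wavelengths matching Sentinel-2’s B5 and B4.

```python
import numpy as np

def ndci(red_edge, red):
    """Normalized Difference Chlorophyll Index (Mishra & Mishra, 2012).

    red_edge, red: 2-D reflectance arrays, e.g., Sentinel-2 band B5 (~705 nm)
    and band B4 (~665 nm). Higher values suggest more chlorophyll-a.
    """
    red_edge = red_edge.astype(float)
    red = red.astype(float)
    denom = red_edge + red
    out = np.full(denom.shape, np.nan)  # NaN where there is no signal
    np.divide(red_edge - red, denom, out=out, where=denom > 0)
    return out

# Hypothetical 3x3 reflectance tiles over a small patch of a reservoir.
b5 = np.array([[0.08, 0.09, 0.07],
               [0.10, 0.12, 0.09],
               [0.06, 0.07, 0.06]])
b4 = np.array([[0.05, 0.05, 0.05],
               [0.04, 0.04, 0.05],
               [0.05, 0.05, 0.05]])

print(np.round(ndci(b5, b4), 2))
# Pixels with elevated values might be flagged for follow-up sampling; any
# actual threshold is site-specific and needs local calibration.
```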

“Over the last decade, satellite resolution has increased dramatically. Years ago, we might have gotten a really fuzzy image of our source water reservoir. We could see the reservoir, but couldn’t make any actionable changes because it was just not granular enough. Now some of the newer satellites are enabling finer scale management and detection of upstream issues that might be heading downstream.

“The data collection method impacts how you manage and analyze the data. In situ sensor data, like flow and temperature, are usually sent through telemetry, and so there’s a near real-time component. You could get on a website that posts that data and query or extract longer-term time series or recent records from 15, 20 minutes ago.”
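Many of those public telemetry feeds are scriptable. As a minimal sketch (one possible approach, not one of Hazen’s tools), the query below pulls recent discharge readings from the USGS Instantaneous Values web service; the site number is just an example gauge ID, and parameter code 00060 is discharge in cubic feet per second.

```python
import requests

# Request the past day of discharge readings (USGS parameter code 00060)
# from the USGS Instantaneous Values service.
resp = requests.get(
    "https://waterservices.usgs.gov/nwis/iv/",
    params={
        "format": "json",
        "sites": "05331000",    # example gauge ID; substitute any active site
        "parameterCd": "00060", # discharge, cubic feet per second
        "period": "P1D",        # ISO 8601 duration: the past 1 day
    },
    timeout=30,
)
resp.raise_for_status()

series = resp.json()["value"]["timeSeries"][0]
site_name = series["sourceInfo"]["siteName"]
for point in series["values"][0]["value"][-3:]:  # three most recent readings
    print(site_name, point["dateTime"], point["value"], "cfs")
```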

“Lab data usually comes with electronic delivery forms, but in a completely separate system. Remote sensing is still a black box, but there are emerging platforms that allow you to download those images, process them, and make inferences about water quality.

“We use the R and Python programming languages interchangeably, depending on staff skill sets and client preferences. At Hazen, we also do a lot of dashboard development for our clients with Microsoft Power BI. Between the three of those, at Hazen at least, that’s where the bulk of our visualization tools come from, but we have other folks who focus on customizable web-based visualizations. I find that R does everything I need it to. I understand the plotting packages in R like the back of my hand now, and if there’s something I don’t know how to do at this point, it is very easy to find the answer. Once you kind of understand the language, you don’t need to understand all the nuances to add more to it.”
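As a generic illustration of that kind of plotting work (synthetic data, shown in Python rather than R, and not a client deliverable), this sketch charts a day of 15-minute dissolved-oxygen readings against an illustrative water quality criterion:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic day of 15-minute dissolved-oxygen readings: DO tends to peak in
# daylight (photosynthesis) and sag overnight (respiration).
times = pd.date_range("2024-07-01", periods=96, freq="15min")
do_mg_l = 8.0 + 1.5 * np.sin(2 * np.pi * (np.arange(96) / 96 - 0.25))

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(times, do_mg_l, color="steelblue", label="Sensor DO")
ax.axhline(5.0, color="firebrick", linestyle="--",
           label="5 mg/L (illustrative criterion)")
ax.set_ylabel("Dissolved oxygen (mg/L)")
ax.set_title("One day of in situ DO readings (synthetic)")
ax.legend(loc="lower right")
fig.autofmt_xdate()
plt.tight_layout()
plt.show()
```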

Data Visualization 

Crary spends a significant portion of his time developing data visualizations. He began during his research as an undergraduate with Dr. Novak and continued throughout his master’s degree at UW–Madison. “I spent a lot of time learning statistics and coding simply to get to the answers of our scientific questions. As I got closer to finishing my master’s degree, a lot of my focus was on how to visualize what I’d learned. Keeping in mind that I was working with enormous data sets with simple inputs, I was constantly thinking about how I could take all that data and tell a story. That was a unique experience because there was such a breadth of data with stories that could have gone any number of ways. I began to think more critically about the communications piece of data science. It was a creative outlet, too, in some respects. I have come to enjoy that storytelling process.”

Working in the industry, Crary applied this creative mindset and found that his clients really appreciate the care and the craft that can go into data visualizations. 

“From the skill development side, it just takes time, right? You do one visualization and you learn something, then you do another. You read papers, you look at books, you see other examples and apply those lessons to what you’re working on.”

Internal Tools

On the data science side, one of Crary’s ongoing projects is an internal tool called StormSight.

A lot of his clients across the country wonder how precipitation patterns are changing. They want to look at what happened in a particular storm and understand how it compared to previous storms. They have questions about long-term trends and about changing patterns of storms. So Hazen built StormSight, an internal analysis dashboard that integrates precipitation data from federal, private, and other agency sources with all the other measurements the team can find, runs the combined data sets through an analysis process, and presents the results in a dashboard environment. It shows recent events in a historical context against long-term trends and helps his clients understand how to factor climate uncertainty into their design decisions.
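StormSight itself is internal to Hazen, so its workings aren’t public. But one standard building block of placing a storm in historical context can be sketched generically: an empirical return period computed from a Weibull plotting position. Everything below, including the rainfall record, is hypothetical.

```python
import numpy as np

def empirical_return_period(annual_maxima, event_depth):
    """Approximate return period (years) via the Weibull plotting position.

    annual_maxima: historical annual-maximum storm depths, in inches.
    event_depth:   the recent event being placed in context, in inches.
    """
    maxima = np.asarray(annual_maxima, dtype=float)
    n = len(maxima)
    # Rank 1 = largest on record; the event ranks one place below every
    # historical maximum that exceeds it.
    rank = 1 + np.sum(maxima > event_depth)
    return (n + 1) / rank

# Hypothetical 30-year record of annual-maximum 24-hour rainfall depths.
rng = np.random.default_rng(42)
record = rng.gamma(shape=6.0, scale=0.5, size=30)  # synthetic depths, inches

t = empirical_return_period(record, 4.2)
print(f"A 4.2-inch storm is roughly a 1-in-{t:.0f}-year event for this record")
```

A production tool would typically fit a distribution (such as the generalized extreme value distribution) rather than rely on raw plotting positions, but the ranking idea is the same.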

Necessary Discussions 

Crary, like many practitioners, sees AI tools as a great resource and a great first step when doing some research. He acknowledges the risk that AI assistants like Copilot or ChatGPT can often answer incorrectly. “Especially in our technical field,” he states. “Perhaps because they haven’t been ‘trained adequately,’ so to speak, or maybe we’re asking questions that aren’t yet answerable. But there is a risk that unskilled adoption of these tools could lead to subpar or declining quality of work products or understanding of water systems.”

Crary notes that traditional design and process engineering has included data-intensive processes, and there are already cases where new data-intensive approaches have been applied successfully, like predictive warning systems for flooding and algal blooms, or real-time management of systems. Yet the industry is still figuring out how to apply some remote sensing and machine learning tools. Crary is confident that the industry will be able to integrate these new tools with ethics and expertise.

Still Learning

“A couple years ago, I took a training course with Dr. Scott Wells at Portland State; he manages an open source water quality model. Kind of laughably, he listed about 18 things we need to be an expert in to model water quality: hydrology, microbiology, phytoplankton, physics—it was a never-ending list of things! The take-home message was to build a good team of folks who can do this together. 

"I am trying to learn all of those things by picking up pieces from others as I go. Integrating all the operational, physical, and bio-geochemical processes in our environment into a predictive tool can be incredibly valuable as we try to evaluate any number of changes that may impact our water resources.” 

Hazen is exploring ways to better understand how machine learning, artificial intelligence, and other increasingly powerful tools will integrate into the management of our systems. Hazen is investing in people like Ben Crary by providing time to research these topics, conceptualize use cases, and make proposals for internal and client tools.
