Students and faculty in the Department of Computer Science & Engineering (CS&E) earned five awards for their papers and posters at the 2023 Society for Industrial and Applied Mathematics (SIAM) International Conference on Data Mining (SDM). The conference was hosted in April at the Graduate Minneapolis Hotel under the leadership of multiple CS&E faculty members, including Shashi Shekhar, Yao-Yi Chiang, Ju Sun, Jaideep Srivastava, and Vipin Kumar.
Ph.D. students Mingzhou Yang (advised by Shekhar), Jina Kim (advised by Chiang), Zekun Li (advised by Chiang), Yijun Lin (advised by Chiang), and Somya Sharma (advised by Kumar) led the papers and poster presentations that earned awards.
Additionally, the overall best paper award at the conference went to a paper led by CS&E alumni Xiaowei Jia (Ph.D., 2020) and Yiqun Xie (Ph.D., 2020). Jia is an assistant professor at the University of Pittsburgh and won the University-wide best dissertation award in 2022. Xie is an assistant professor at the University of Maryland.
The five award-winning papers and posters are detailed below:
Authors: Mingzhou Yang, Bharat Jayaprakash, Matthew Eagon, Hyeonjung (Tari) Jung, William F. Northrop, and Shashi Shekhar
Summary: Society must achieve net zero carbon emissions to mitigate anthropogenic climate change and preserve a livable planet. Reducing transportation emissions is an important component of achieving net zero because such emissions account for a quarter of global carbon released into the environment. Driven by increasingly available transportation big data and enhanced computational speed, data mining techniques have become powerful tools for achieving transportation decarbonization. This paper describes existing gaps in transportation decarbonization research where data mining can help address problems related to medium and heavy vehicle electrification, electric micromobility safety, and analysis of alternative fuel-powered and plug-in hybrid electric vehicles. Our recommendations encompass open research problems, opportunities for data mining applications, and examples of areas where advancements in data mining techniques are needed. We encourage the data mining community to explore these challenges and opportunities to help achieve net zero emissions goals.
Author: Jina Kim
Summary: The awarded poster was about developing machine learning methods to understand human-environmental interactions by detecting meaningful patterns and exploiting multiple hidden relations in spatial data from all sources, including overhead images, street views, trajectories, and spatial documents. The meaningful cues from these machine learning methods are ultimately intended to help address real-world problems such as those targeted by the United Nations Sustainable Development Goals (e.g., water, food, or public health).
Author: Zekun Li
Summary: The poster focuses on information extraction and understanding of two geospatial data forms: scanned historical maps and contemporary structured geospatial databases. It includes 1) the creation of synthetic historical maps, 2) a language model to capture the relations between 2D geo-entities and produce spatial-context-aware features, and 3) a machine learning system for historical map understanding.
Authors: Yijun Lin, Tianqi Luo, Joseph Talghader, Yao-Yi Chiang, Daniel Bond
Summary: This work is a collaboration between computer science and electrical engineering. Our goal is to build a physics-guided machine learning approach to estimate the fluid dynamics of the governing system and the overall population of the microparticles by modeling the measurements extracted from partially retrieved particles.
Author: Somya Sharma
Summary: In hydrology, river flow can be impacted by changes in the physical characteristics (e.g., climate, soil geology, geomorphology) of the river basins. Often, these physical characteristics are not monitored, leading to uncertainty about their impact on river flow. My paper proposes a probabilistic prediction tool that not only estimates these physical characteristics but also provides confidence scores that give insight into the quality of predictions. We also establish that these estimates of physical characteristics can enable improved forecasting of streamflow.