Seongjin Choi's research relies on big data
On January 29 of this year, a plane and a helicopter collided over the Potomac River in Washington, D.C., killing 67 people. Immediately there were calls for action to avoid such crashes. One solution was to separate plane traffic from helicopter traffic. While that could work as an immediate remedy, the incident highlights a growing problem: as more people and vehicles move about, density increases, and the likelihood of collisions and close calls rises with it. Seongjin Choi is working on other solutions.

Seongjin Choi, an assistant professor of transportation engineering, studies urban mobility. He is driven to develop safe, efficient, and sustainable transportation systems in urban settings. In the process, he does research on cars, trucks, airplanes, pedestrians, and how all these systems work together. To deal with these large and complex systems, Choi relies on big data.
Choi’s broad and interdisciplinary interests encompass urban mobility data analytics, spatiotemporal data modeling, deep learning and artificial intelligence, connected automated vehicles (CAV), and cooperative intelligent transportation systems (C-ITS). His current focus is creating predictive models of traffic, motion, and trajectories. His next goal is to use those predictive models to build systems that respond proactively, averting incidents before they happen.
Choi draws insights into urban mobility through data analytics. Having access to large amounts of data significantly broadens both the complexity of problems that researchers can address and the scope of insights they can derive. Advancements are spurred by the sharing of data (so-called open data) and by many researchers competing to see who can handle data faster and better.
Big data is generally thought of as huge collections of data that can overwhelm traditional data processing tools. New tools are being developed to handle very large amounts of information. The field is evolving rapidly—almost every year, new methodologies and new technologies emerge. Advancements like machine learning and deep learning enable more accurate, adaptive, real-time decision-making for our future transportation systems.
Choi finds keeping up with the latest trends challenging, but also exciting. He finds this moment in transportation research to be opportune. “It is exciting to be part of this change. I am discovering what I can do with so much data,” says Choi. “Yet, we need not only big data (lots of situations to analyze), we need also good data (data that is not missing information and that covers a wide spectrum of situations that we want to study). To design safe, efficient systems, we also need smarter strategies for collecting high-quality, relevant data.” He finds the need for good data especially critical when designing predictive systems for safety-critical applications, such as those that would drive autonomous vehicles or actively avoid collisions at airports, on highways, or in busy city intersections. Choi is optimistic about the data sources available, primarily for the promise they hold for increasing the safety of urban transportation systems.
One example would be an intersection monitoring system that could help protect vulnerable road users (e.g., pedestrians, cyclists, road workers, elderly people, and people with disabilities). Another example would be controllers for self-driving vehicles. In these instances, safety-critical situations are infrequent in observed data: the events show up rarely and may not appear at all in a small set of observations or measurements. However, safety-critical situations are the very reason for building these systems. It is critical that safety systems are built to recognize and handle the rare situations that require a safety response. Developers need lots of data to “teach” the smart systems (through machine learning) how to identify and respond to safety-critical situations.
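To see why rare events demand so much data, consider a rough back-of-the-envelope sketch. It assumes, purely for illustration, that each observation is independent and that a safety-critical event occurs in one out of every 10,000 vehicle passes; neither the rate nor the function name comes from Choi's work.

```python
def prob_event_never_observed(p_event: float, n_observations: int) -> float:
    """Chance that an event with per-observation probability p_event
    shows up zero times in n independent observations."""
    return (1.0 - p_event) ** n_observations


# Hypothetical rate: one safety-critical event per 10,000 vehicle passes.
p_event = 1e-4

for n in (1_000, 10_000, 100_000):
    missed = prob_event_never_observed(p_event, n)
    print(f"{n:>7} observations: chance of seeing no event = {missed:.4f}")
```

Under these assumptions, a small data set is more likely than not to contain no safety-critical event at all, which is the gap that larger and better-targeted data collection is meant to close.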
One recent project for Choi and his colleagues involves predicting trajectories. That is, for an intended trajectory (for example, a car making a left turn), how accurately can models predict the exact path that vehicle will take through the intersection? How widely might the vehicle turn? How quickly? How far out into the intersection might the arc of that turn reach? Planning for the worst-case scenario is safe, but if developers could be more precise in their predictions, they could build more efficient traffic systems in addition to safer ones.
Researchers map this problem out by creating a model of an intersection and then calculating, pixel by pixel, the probability that each pixel will be occupied as a vehicle goes through a left turn. In data terms, a trajectory can be described as a sequence of locations and timestamps of a moving object.
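As a minimal sketch of that idea, the snippet below stores each trajectory as a list of (timestamp, x, y) points and estimates, for every grid cell, the fraction of observed left turns that passed through it. The grid size, the two sample turns, and the function name `occupancy_grid` are illustrative assumptions, not the researchers' actual data or code, and simple counting stands in for the learned models described in this article.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (timestamp in s, x in m, y in m)
Trajectory = List[Point]


def occupancy_grid(trajectories: List[Trajectory], x_max: float,
                   y_max: float, cell: float) -> List[List[float]]:
    """Fraction of trajectories that visit each cell (row = y, col = x)."""
    if not trajectories:
        raise ValueError("need at least one trajectory")
    n_cols, n_rows = int(x_max / cell), int(y_max / cell)
    counts = [[0] * n_cols for _ in range(n_rows)]
    for traj in trajectories:
        visited = set()  # count each cell once per trajectory
        for _, x, y in traj:
            row, col = int(y // cell), int(x // cell)
            if 0 <= row < n_rows and 0 <= col < n_cols:
                visited.add((row, col))
        for row, col in visited:
            counts[row][col] += 1
    total = len(trajectories)
    return [[c / total for c in row] for row in counts]


# Two hypothetical left turns through a 20 m x 20 m intersection.
left_turns = [
    [(0.0, 12.0, 2.0), (1.0, 12.0, 8.0), (2.0, 8.0, 12.0), (3.0, 2.0, 12.0)],  # tight
    [(0.0, 12.0, 2.0), (1.0, 16.0, 9.0), (2.0, 9.0, 16.0), (3.0, 2.0, 14.0)],  # wide
]
grid = occupancy_grid(left_turns, x_max=20.0, y_max=20.0, cell=5.0)
for row in grid:
    print(row)
```

Cells that every turn crosses get a value of 1.0, cells that only the wide turn reaches get 0.5, and untouched cells stay at 0.0. A real system would work at much finer resolution and with far more trajectories.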
In transportation engineering, years of trajectory data have been collected through observations and sensors. Those data are used to analyze travel behavior and the dynamics of movement for vehicles and pedestrians, and to predict where an agent will go next. That analysis must be repeated for each agent (vehicle, pedestrian, etc.) and each movement through a defined space. From that vast amount of collected data, developers strive to build proactive models that can respond to actual movements and avoid collisions.
Choi also applies his research method to airspace and movement of planes. Congested airspace is a particularly pressing problem in Seoul, South Korea. Beginning with his Ph.D. research, Choi analyzed trajectory data from an airport in Seoul. That data set had approximately 60,000 possible trajectories for airplanes taking off from a very busy runway. At first, Choi mainly studied travel behavior using macroscopic trajectories, that is, looking at when and where the trips started and ended, the route taken, and what affected decisions.
In a recent publication, Jungwoo Cho and Seongjin Choi proposed a framework that uses probabilistic aircraft trajectory prediction to assess the feasibility of integrating Urban Air Mobility (UAM) routes into existing airspace. They applied the methodology to the airspace over Seoul, encompassing interactions between air traffic (planes) and conventional traffic (cars and trucks) at multiple altitudes and lanes. The proposed framework predicts short-term trajectory distributions of conventional aircraft, enabling planes to dynamically adjust speeds and maintain safe separations. The results reveal that different physical locations of lanes and routes experience varying patterns of interaction and varying encounter dynamics.
Integrating UAM into airspace managed by Air Traffic Control poses significant challenges, particularly in congested terminal environments, a facet of Choi’s research. Limited trajectory data for departing aircraft in this region occasionally led to tighter separations and increased operational challenges. The study by Cho and Choi underscores the potential of predictive modeling in facilitating UAM integration while highlighting critical trade-offs between safety and efficiency. The findings contribute to refining airspace management strategies and offer insights for scaling UAM operations in complex urban environments.
More recently, Choi has been exploring microscopic trajectories, looking more closely at the trajectory of an individual vehicle or pedestrian and predicting the probability of its next location. A great challenge for transportation researchers is that transportation systems and intelligent agents (like autonomous vehicles) must understand the future motion of traffic participants to plan motion trajectories effectively. At the same time, the future motion of traffic participants is inherently uncertain. In another recent study, Choi introduces TrajFlow, a probabilistic framework for modeling occupancy densities in a road traffic scenario. TrajFlow estimates the likelihood that a particular vehicle or pedestrian (the scope of this study) will be at a specific location at a given time. The TrajFlow framework developed by Choi and his student uses a causal encoder to extract semantically meaningful embeddings of the observed trajectory and a normalizing flow to decode these embeddings and determine the most likely location of traffic participants at a future point in time. TrajFlow’s predictions are conditioned on a sequence of observed historical locations. This method can be extended to other moving objects, such as aircraft, hurricanes, typhoons, or debris.
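The general idea behind a normalizing flow can be shown with a deliberately tiny sketch. This is not TrajFlow: a constant-velocity extrapolation stands in for the learned encoder, a single shift-and-scale step stands in for the learned flow, and every function name and constant below is a hypothetical choice made for illustration.

```python
import math
import random


def condition_on_history(history, horizon_s):
    """Toy stand-in for an encoder: extrapolate the last two observed
    (t, x, y) points at constant velocity; uncertainty grows with horizon."""
    (t0, x0, y0), (t1, x1, y1) = history[-2], history[-1]
    dt = t1 - t0  # assumes the last two timestamps differ
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    mean = (x1 + vx * horizon_s, y1 + vy * horizon_s)
    scale = 0.5 + 0.5 * horizon_s  # hypothetical spread in meters
    return mean, scale


def log_density(location, history, horizon_s):
    """Log-density of a future location via the change-of-variables rule."""
    mean, scale = condition_on_history(history, horizon_s)
    # Inverse flow: map the location back to the standard-normal base space.
    zx = (location[0] - mean[0]) / scale
    zy = (location[1] - mean[1]) / scale
    log_base = -0.5 * (zx * zx + zy * zy) - math.log(2.0 * math.pi)
    log_det = -2.0 * math.log(scale)  # Jacobian of the inverse map in 2D
    return log_base + log_det


def sample_location(history, horizon_s, rng):
    """Forward flow: push a base-space sample out to a future location."""
    mean, scale = condition_on_history(history, horizon_s)
    return (mean[0] + scale * rng.gauss(0.0, 1.0),
            mean[1] + scale * rng.gauss(0.0, 1.0))


history = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]  # moving 2 m/s along x
print(log_density((5.0, 0.0), history, horizon_s=1.5))
print(sample_location(history, horizon_s=1.5, rng=random.Random(0)))
```

Because the horizon is an ordinary number rather than a fixed time step, the same functions can be queried at any future moment. A learned flow replaces the single scale-and-shift with a stack of invertible transformations, which lets the predicted density take shapes far richer than this one Gaussian.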
The formulation used in TrajFlow differs from existing approaches because it models the marginal distribution of spatial locations instead of the joint distribution of unobserved trajectories. The marginal formulation has several advantages. Choi and his student have demonstrated that it produces higher accuracy on challenging trajectory forecasting benchmarks. It also allows for fully continuous sampling of future locations. Finally, marginal densities are better suited for downstream tasks (for example, collision warning or trajectory planning), as they allow for the computation of per-agent motion trajectories and occupancy grids, the two most commonly used representations for motion forecasting.
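One way a downstream collision warning could use per-agent occupancy grids is sketched below. It assumes each grid holds the probability that one agent is in each cell at the same future instant, that each grid sums to 1, and that the two agents move independently. The grids, the threshold, and the function name are hypothetical and are not taken from the study.

```python
def overlap_probability(grid_a, grid_b):
    """Probability that two independent agents occupy the same cell,
    given per-cell occupancy probabilities for the same future instant."""
    return sum(pa * pb
               for row_a, row_b in zip(grid_a, grid_b)
               for pa, pb in zip(row_a, row_b))


# Hypothetical 2 x 2 occupancy grids for one future time step.
car = [[0.0, 0.1],
       [0.2, 0.7]]
pedestrian = [[0.0, 0.0],
              [0.5, 0.5]]

WARNING_THRESHOLD = 0.05  # hypothetical tolerance, not a published value

risk = overlap_probability(car, pedestrian)
print(f"overlap probability: {risk:.2f}")
if risk > WARNING_THRESHOLD:
    print("collision warning")
```

The independence assumption is the price of working with marginals alone: the calculation needs nothing more than each agent's own grid, but it ignores how the car and the pedestrian might react to one another.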
Choi is using big data to improve safety and advance our transportation networks. CEGE is thrilled to see what he and his students are able to accomplish.