Evolution of Diffusion Models: From Birth to Enhanced Efficiency and Controllability
Industrial Problems Seminar
Abstract
Diffusion models excel at high-fidelity data generation across many domains, but they suffer from slow sampling and a fixed generation resolution. This talk introduces the fundamentals of diffusion models and presents methods that improve training and sampling efficiency by leveraging a unified "consistency" concept rooted in the models’ mathematical structure.
To speed up sampling, I will present the Consistency Trajectory Model (CTM), which compresses a pre-trained diffusion model into a single network. CTM computes scores (log-density gradients) in one forward pass, enabling efficient traversal along the Probability Flow ODE and supporting novel deterministic and stochastic sampling methods, including long jumps along ODE paths.
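For orientation (using the EDM-style noise parameterization as an assumption, and my own notation \(G_\theta\) for the learned trajectory map), the Probability Flow ODE and the anytime-to-anytime jump that CTM-style models approximate can be written as

\[
\frac{\mathrm{d}x_t}{\mathrm{d}t} = -\,t\,\nabla_{x_t}\log p_t(x_t),
\qquad
G_\theta(x_t, t, s) \;\approx\; x_t + \int_t^{s}\frac{\mathrm{d}x_u}{\mathrm{d}u}\,\mathrm{d}u,
\]

so a single forward pass of \(G_\theta\) moves a sample from any time \(t\) to any earlier time \(s\) along the ODE solution: \(s = 0\) gives one-step generation, short steps recover the usual deterministic or stochastic samplers, and the score follows from the infinitesimal limit \(s \to t\).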
To overcome resolution constraints, I will introduce Progressive Growing of Diffusion Autoencoder (PaGoDA), which extends a generator’s output resolution beyond that of a pre-trained diffusion model. By encoding high-resolution data into a structured latent space, PaGoDA progressively grows the decoder’s resolution, improving efficiency without retraining the underlying diffusion model as the resolution increases.
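As a rough, hypothetical illustration of the progressive-growing idea only (not PaGoDA’s actual architecture, training objective, or code), the PyTorch sketch below freezes a low-resolution decoder stage and adds a new upsampling block that doubles the output resolution while reusing the same latent code; the 64-dimensional latent, the layer sizes, and all names are assumptions made for this example.

    # Hypothetical sketch of progressive decoder growing (illustration only).
    import torch
    import torch.nn as nn

    LATENT_DIM = 64  # assumed size of the structured latent code

    class BaseDecoder(nn.Module):
        """Stands in for a pre-trained low-resolution decoder stage (latent -> 64x64)."""
        def __init__(self, latent_dim=LATENT_DIM):
            super().__init__()
            self.fc = nn.Linear(latent_dim, 128 * 8 * 8)
            self.blocks = nn.Sequential(
                nn.Upsample(scale_factor=2), nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
                nn.Upsample(scale_factor=2), nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
                nn.Upsample(scale_factor=2), nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            )
            self.to_rgb = nn.Conv2d(16, 3, 3, padding=1)

        def forward(self, z):
            h = self.fc(z).view(-1, 128, 8, 8)
            h = self.blocks(h)
            # Return features alongside the image so a later stage can grow from them.
            return self.to_rgb(h), h

    class GrownDecoder(nn.Module):
        """Adds one upsampling stage on top of a frozen earlier stage (64x64 -> 128x128)."""
        def __init__(self, prev_stage):
            super().__init__()
            self.prev = prev_stage
            for p in self.prev.parameters():  # earlier resolutions are kept fixed
                p.requires_grad = False
            self.new_block = nn.Sequential(
                nn.Upsample(scale_factor=2), nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            )
            self.to_rgb = nn.Conv2d(16, 3, 3, padding=1)

        def forward(self, z):
            _, feats = self.prev(z)  # the same latent code drives every stage
            return self.to_rgb(self.new_block(feats))

    z = torch.randn(2, LATENT_DIM)
    low_res, _ = BaseDecoder()(z)              # shape (2, 3, 64, 64)
    high_res = GrownDecoder(BaseDecoder())(z)  # shape (2, 3, 128, 128)
    print(low_res.shape, high_res.shape)

In this sketch only the newly added block and its output head would be trained at each stage, which is what allows the output resolution to grow without touching the earlier layers.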
If time permits, I will discuss applications to controllable generation and to media restoration, framed as inverse problems.