Evolution of Diffusion Models: From Birth to Enhanced Efficiency and Controllability

Industrial Problems Seminar

Abstract

Diffusion models excel at high-fidelity data generation across many domains, but they suffer from slow sampling and a fixed output resolution. This talk introduces the fundamentals of diffusion models and presents methods that improve training and sampling efficiency through a unified "consistency" concept rooted in the models’ mathematical structure.

To speed up sampling, I will present the Consistency Trajectory Model (CTM), which compresses a pre-trained diffusion model into a single network. CTM computes scores (log-density gradients) in one forward pass, enabling efficient traversal along the Probability Flow ODE and supporting novel deterministic and stochastic sampling methods, including long jumps along ODE paths.
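To make the idea of a "long jump" along a Probability Flow ODE path concrete, here is a minimal toy sketch. It is not the CTM method itself: CTM is a trained neural network, whereas this example uses a one-dimensional Gaussian data distribution for which the score and the ODE solution are known in closed form. The noise convention (noise standard deviation equal to time t, giving the ODE dx/dt = -t * score(x, t)), the constant DATA_STD, and all function names are assumptions chosen for illustration.

```python
import math

DATA_STD = 0.5  # assumed toy data distribution: N(0, DATA_STD^2)


def score(x, t):
    """Score (log-density gradient) of the noised marginal N(0, DATA_STD^2 + t^2)."""
    return -x / (DATA_STD**2 + t**2)


def euler_pf_ode(x, t_start, t_end, n_steps):
    """Integrate dx/dt = -t * score(x, t) from t_start to t_end with many small Euler steps."""
    dt = (t_end - t_start) / n_steps
    t = t_start
    for _ in range(n_steps):
        x = x + dt * (-t * score(x, t))
        t += dt
    return x


def exact_jump(x, t_start, t_end):
    """Closed-form solution of the same ODE for this toy case: one jump, no iteration."""
    return x * math.sqrt((DATA_STD**2 + t_end**2) / (DATA_STD**2 + t_start**2))


x_noisy = 5.0
slow = euler_pf_ode(x_noisy, t_start=10.0, t_end=0.0, n_steps=10_000)
fast = exact_jump(x_noisy, t_start=10.0, t_end=0.0)
print(slow, fast)  # the two values should agree closely
```

In this toy setting the single jump replaces thousands of solver steps. A trajectory model in the spirit of the abstract would aim to learn such a jump map for real data, where no closed form is available.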

To overcome resolution constraints, I will introduce Progressive Growing of Diffusion Autoencoder (PaGoDA), which extends a generator’s resolution beyond that of a pre-trained diffusion model. By encoding high-resolution data into a structured latent space, PaGoDA incrementally increases the decoder’s resolution, enhancing efficiency without retraining during upsampling.

If time permits, I will discuss applications in controllable generation and media restoration for solving inverse problems.

This recording was created before the current policy requirements took effect, and therefore may not be accessible. To request this content in an accessible format, contact [email protected].
