Evolution of Diffusion Models: From Birth to Enhanced Efficiency and Controllability

Industrial Problems Seminar

Chieh-Hsin (Jesse) Lai
Sony

Abstract

Diffusion models excel at high-fidelity data generation across many domains but suffer from slow sampling and a fixed output resolution. This talk introduces the fundamentals of diffusion models, then presents methods that improve training and sampling efficiency by leveraging a unified "consistency" concept rooted in the models’ mathematical structure.

To speed up sampling, I will present the Consistency Trajectory Model (CTM), which compresses a pre-trained diffusion model into a single network. CTM computes scores (log-density gradients) in one forward pass, enabling efficient traversal along the Probability Flow ODE and supporting novel deterministic and stochastic sampling methods, including long jumps along ODE paths.
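As a rough illustration of what a "long jump" along a Probability Flow ODE path means, the sketch below uses a toy 1-D Gaussian data distribution whose score is known in closed form. It is not CTM itself: the names `SIGMA_DATA`, `score`, `euler_solve`, and `long_jump` are hypothetical, the noise schedule sigma(t) = t is an assumption, and the closed-form jump stands in for what a trained network would have to learn.

```python
import math

SIGMA_DATA = 0.5  # assumed std of the toy 1-D Gaussian data distribution


def score(x, t):
    """Exact score of p_t = N(0, SIGMA_DATA^2 + t^2); a stand-in for a learned network."""
    return -x / (SIGMA_DATA**2 + t**2)


def pf_ode_drift(x, t):
    """Probability Flow ODE drift dx/dt = -t * score(x, t), assuming sigma(t) = t."""
    return -t * score(x, t)


def euler_solve(x, t_start, t_end, num_steps):
    """Traverse the ODE path from t_start to t_end with many small Euler steps."""
    dt = (t_end - t_start) / num_steps
    t = t_start
    for _ in range(num_steps):
        x = x + dt * pf_ode_drift(x, t)
        t = t + dt
    return x


def long_jump(x, t_start, t_end):
    """Closed-form ODE solution for this toy case: one evaluation, any t_start -> t_end."""
    return x * math.sqrt((SIGMA_DATA**2 + t_end**2) / (SIGMA_DATA**2 + t_start**2))


x_noisy = 3.0
many_steps = euler_solve(x_noisy, t_start=10.0, t_end=0.1, num_steps=10_000)
one_jump = long_jump(x_noisy, t_start=10.0, t_end=0.1)
print(many_steps, one_jump)  # the two values should nearly coincide
```

In this toy setting the single jump reproduces what the step-by-step solver reaches only after thousands of score evaluations, and chaining jumps through an intermediate time gives the same endpoint as one direct jump.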

To overcome resolution constraints, I will introduce Progressive Growing of Diffusion Autoencoder (PaGoDA), which extends a generator’s resolution beyond that of a pre-trained diffusion model. By encoding high-resolution data into a structured latent space, PaGoDA incrementally increases the decoder’s resolution, enhancing efficiency without retraining during upsampling.

If time permits, I will discuss applications in controllable generation and media restoration for solving inverse problems.

Start date
Friday, Nov. 15, 2024, 1:25 p.m.
End date
Friday, Nov. 15, 2024, 2:25 p.m.
Location

Lind Hall 325 or Zoom

Zoom registration
