Diffusion refers to the process by which particles, information, or energy spread from regions of high concentration to regions of low concentration. In machine learning, diffusion models invert a gradual noising process to generate data such as images or audio, iteratively denoising pure random noise into a coherent sample. These models have achieved state-of-the-art results in generative AI, particularly for image synthesis. Diffusion also plays a role in physics, chemistry, and materials science, where it describes natural transport phenomena. Its mathematical foundations make it a bridge between physical modeling and computational generation.
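To make the "iterative denoising" idea concrete, here is a minimal sketch of a DDPM-style reverse (sampling) loop with a linear noise schedule. The `predict_noise` function is a stand-in for the trained noise-prediction network (typically a U-Net); everything else follows the standard ancestral-sampling update.

```python
import numpy as np

# Linear noise schedule (a common DDPM choice)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predict_noise(x_t, t):
    """Stand-in for the trained noise-prediction network (usually a U-Net).
    Returning zeros keeps the example runnable; a real model estimates the
    noise that was added to the clean sample at step t."""
    return np.zeros_like(x_t)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))  # start from pure Gaussian noise

# Reverse (denoising) process: iteratively remove the predicted noise
for t in reversed(range(T)):
    eps_hat = predict_noise(x, t)
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    mean = (x - coef * eps_hat) / np.sqrt(alphas[t])
    noise = rng.standard_normal(x.shape) if t > 0 else 0.0
    x = mean + np.sqrt(betas[t]) * noise  # sigma_t = sqrt(beta_t) is a standard choice

print(x.shape)  # the final x is the generated sample
```

In a trained model, `predict_noise` is what learning actually produces; the sampling loop itself stays essentially this simple.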

Posts

AutoScape: Geometry-Consistent Long-Horizon Scene Generation

This paper proposes AutoScape, a long-horizon driving scene generation framework. At its core is a novel RGB-D diffusion model that iteratively generates sparse, geometrically consistent keyframes, serving as reliable anchors for the scene's appearance and geometry. To maintain long-range geometric consistency, the model 1) jointly handles image and depth in a shared latent space, 2) explicitly conditions on the existing scene geometry (i.e., rendered point clouds) from previously generated keyframes, and 3) steers the sampling process with warp-consistent guidance. Given high-quality RGB-D keyframes, a video diffusion model then interpolates between them to produce dense and coherent video frames. AutoScape generates realistic and geometrically consistent driving videos of over 20 seconds, improving the long-horizon FID and FVD scores over the prior state-of-the-art by 48.6% and 43.0%, respectively.
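The two-stage structure described above (sparse geometry-anchored keyframes, then video-diffusion interpolation) can be summarized in a short sketch. All function names below are hypothetical stand-ins, not the paper's API; only the control flow mirrors the abstract: each keyframe is conditioned on geometry rendered from earlier keyframes, and a second model densifies between consecutive keyframes.

```python
import numpy as np

# --- Hypothetical stand-ins for the learned components described in the abstract ---

def rgbd_diffusion_keyframe(geometry_condition, rng):
    """Stand-in for the RGB-D diffusion model: in AutoScape it denoises image and
    depth in a shared latent space, conditioned on rendered scene geometry and
    steered by warp-consistent guidance. Here it just returns random RGB-D."""
    rgb = rng.random((64, 64, 3))
    depth = rng.random((64, 64, 1)) * 50.0  # arbitrary depth scale for the toy example
    return np.concatenate([rgb, depth], axis=-1)

def render_point_cloud(keyframes):
    """Stand-in for rendering the accumulated scene geometry into the next
    keyframe's view; a real implementation would reproject using camera poses."""
    if not keyframes:
        return None
    return np.mean(np.stack(keyframes), axis=0)  # placeholder "render"

def video_diffusion_interpolate(kf_a, kf_b, n_frames):
    """Stand-in for the video diffusion model that interpolates dense frames
    between two keyframes; here a simple linear blend of the RGB channels."""
    return [
        (1 - t) * kf_a[..., :3] + t * kf_b[..., :3]
        for t in np.linspace(0.0, 1.0, n_frames)
    ]

# --- Long-horizon generation loop mirroring the two-stage design ---
rng = np.random.default_rng(0)
keyframes = []
for _ in range(5):                               # sparse, geometry-anchoring keyframes
    cond = render_point_cloud(keyframes)         # condition on existing scene geometry
    keyframes.append(rgbd_diffusion_keyframe(cond, rng))

video = []
for kf_a, kf_b in zip(keyframes[:-1], keyframes[1:]):
    video.extend(video_diffusion_interpolate(kf_a, kf_b, n_frames=12))

print(f"{len(keyframes)} keyframes -> {len(video)} dense frames")
```

The point of the sketch is the dependency structure: long-range consistency comes from conditioning each new keyframe on geometry accumulated so far, while the per-segment interpolation only needs to stay locally coherent between adjacent keyframes.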