Sampling Data with Chains of Forward-Backward Diffusion Steps
Hyunmo Kang, Noam Itzhak Levi, Corinna Elena Wegner, Daniel J. Korchinski, Matthieu Wyart
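AI summary: Proposes U-turn chains, Markov chains constructed by iterating short forward-backward steps of a diffusion model, which, combined with a Metropolis-Hastings correction, sample from energy-modified targets. Finds that minimal U-turn dynamics undergoes an ergodicity-breaking phase transition driven by fragmentation of the data manifold.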
Sampling from learned high-dimensional distributions is a foundational computational problem. We introduce U-turn chains: Markov chains obtained by iterating short forward-backward steps of a diffusion model, in which each step proposes a move that remains on the learned data manifold and, paired with a Metropolis-Hastings correction, samples from energy-modified targets. For synthetic languages, we show that minimal U-turn dynamics undergoes an ergodicity-breaking phase transition driven by fragmentation of the data manifold; ergodicity is restored at larger U-turn magnitude. In the non-ergodic regime, low-level features relax faster than high-level ones, an ordering that inverts only at sufficiently large U-turn magnitude. We test these predictions on natural language and natural images. In both modalities, minimal U-turns relax slowly, especially for high-level features approximated by deep representations in CNNs or LLMs. The layer-ordering inversion appears only at large noise when mixing is efficient -- signatures consistent with strongly constrained, weakly mixing local dynamics. We discuss the implications of these results for sampling with diffusion models.
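To make the forward-backward-plus-correction idea concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. It assumes a three-state discrete data distribution, a forward noise that resamples the state uniformly with probability `t`, and an exact Bayesian posterior standing in for a learned diffusion model. All names (`u_turn_proposal`, `u_turn_chain`, `p_data`, `energy`, `t`) are hypothetical. Under these assumptions the U-turn proposal is reversible with respect to `p_data`, so the Metropolis-Hastings ratio for a target proportional to `p_data * exp(-energy)` reduces to `exp(-(E(y) - E(x)))`. With a learned, imperfect denoiser this reduction would hold only approximately.

```python
import math
import random


def forward_kernel(x0, xt, t, n_states):
    """q(x_t | x_0): keep x_0 with prob. 1 - t, else resample uniformly."""
    keep = (1.0 - t) if xt == x0 else 0.0
    return keep + t / n_states


def u_turn_proposal(x0, p_data, t, rng):
    """One forward-backward step using the exact posterior p(x_0 | x_t)."""
    n = len(p_data)
    # Forward: corrupt x0 at noise level t (toy stand-in for diffusion noise).
    xt = x0 if rng.random() > t else rng.randrange(n)
    # Backward: sample x0' with weight p_data(x0') * q(xt | x0').
    weights = [p_data[y] * forward_kernel(y, xt, t, n) for y in range(n)]
    return rng.choices(range(n), weights=weights, k=1)[0]


def u_turn_chain(p_data, energy, t, n_steps, seed=0):
    """Run the toy chain and return the empirical state frequencies."""
    rng = random.Random(seed)
    x = rng.choices(range(len(p_data)), weights=p_data, k=1)[0]
    counts = [0] * len(p_data)
    for _ in range(n_steps):
        y = u_turn_proposal(x, p_data, t, rng)
        # Assumption: the proposal is reversible w.r.t. p_data, so the
        # acceptance probability is min(1, exp(-(E(y) - E(x)))).
        if rng.random() < math.exp(min(0.0, energy[x] - energy[y])):
            x = y
        counts[x] += 1
    return [c / n_steps for c in counts]


# Toy data distribution and energy (illustrative values only).
p_data = [0.5, 0.3, 0.2]
energy = [1.0, 0.0, 0.0]

# Energy-modified target, normalized for comparison.
target = [p * math.exp(-e) for p, e in zip(p_data, energy)]
z = sum(target)
target = [w / z for w in target]

empirical = u_turn_chain(p_data, energy, t=0.5, n_steps=200_000)
print("target   :", [round(v, 3) for v in target])
print("empirical:", [round(v, 3) for v in empirical])
```

In this sketch `t` plays the role of the U-turn magnitude: small `t` yields proposals that rarely leave the current state, while `t` close to 1 makes each proposal nearly an independent draw from `p_data`.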