Probability-Conserving Flow Guidance
Parsa Esmati, Junha Hyung, Amirhossein Dadashzadeh, Jaegul Choo, Majid Mirmehdi
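AI summary: This paper proposes AdaMaG, a probability-conserving flow guidance method. By analysing the continuity equation, it decomposes the effect of guidance into a divergence term and a score-parallel term, and controls both through a time-dependent schedule and score-parallel attenuation, improving generation quality and reducing hallucinations at no additional inference cost.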
Diffusion and flow-based generative models dominate visual synthesis, with guidance aligning samples to user input and improving perceptual quality. However, Classifier-Free Guidance (CFG) and extrapolation-based methods are heuristic linear combinations of velocities/scores that ignore the generative manifold geometry, breaking probability conservation and driving samples off the learned manifold under strong guidance. We analyse guidance through the continuity equation and show its effect decomposes into a divergence term and a score-parallel term defined invariantly across parameterisations. We prove the divergence term blows up structurally as sampling approaches the data manifold, motivating a time-dependent schedule alongside score-parallel attenuation. The resulting plug-and-play rule, Adaptive Manifold Guidance (AdaMaG), bounds both terms at no additional inference cost. Finally, we show that most empirical heuristics for reducing saturation or improving generation quality correspond directly to the two terms in our decomposition. Across image generation benchmarks, AdaMaG improves realism, reduces hallucinations, and induces controlled desaturation in high-guidance regimes.
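To make the ingredients named in the abstract concrete, the following is a minimal, dependency-free sketch of (a) standard CFG as a linear combination of velocities and (b) a guidance rule that combines a time-dependent weight with attenuation of the component parallel to a score estimate. It is not the AdaMaG rule itself, since the abstract gives no formulas. The linear schedule, the `parallel_gain` parameter, the time convention (t = 0 is noise, t = 1 is data) and all function names are assumptions made purely for illustration.

```python
from typing import List, Tuple

Vector = List[float]  # flat list standing in for a latent/image tensor


def dot(a: Vector, b: Vector) -> float:
    """Euclidean inner product of two flat vectors."""
    return sum(x * y for x, y in zip(a, b))


def cfg_velocity(v_uncond: Vector, v_cond: Vector, w: float) -> Vector:
    """Standard CFG: v_uncond + w * (v_cond - v_uncond)."""
    return [u + w * (c - u) for u, c in zip(v_uncond, v_cond)]


def split_parallel(delta: Vector, ref: Vector,
                   eps: float = 1e-12) -> Tuple[Vector, Vector]:
    """Split delta into parts parallel and orthogonal to ref."""
    coeff = dot(delta, ref) / (dot(ref, ref) + eps)
    parallel = [coeff * r for r in ref]
    orthogonal = [d - p for d, p in zip(delta, parallel)]
    return parallel, orthogonal


def scheduled_weight(t: float, w_max: float) -> float:
    """Hypothetical linear schedule: w_max at t=0 (noise), 1 at t=1 (data).

    Assumption: guidance is relaxed as sampling approaches the data manifold.
    """
    return 1.0 + (w_max - 1.0) * (1.0 - t)


def guided_velocity(v_uncond: Vector, v_cond: Vector, score: Vector, t: float,
                    w_max: float = 7.5, parallel_gain: float = 0.25) -> Vector:
    """Illustrative guidance rule (an assumption, not the paper's formula).

    The guidance direction (v_cond - v_uncond) is split relative to the score
    estimate; the score-parallel part is scaled by parallel_gain before the
    scheduled extrapolation is applied on top of v_cond.
    """
    delta = [c - u for u, c in zip(v_uncond, v_cond)]
    parallel, orthogonal = split_parallel(delta, score)
    extra = scheduled_weight(t, w_max) - 1.0
    return [c + extra * (parallel_gain * p + o)
            for c, p, o in zip(v_cond, parallel, orthogonal)]


if __name__ == "__main__":
    v_u = [0.0, 1.0, -1.0]
    v_c = [1.0, 3.0, 0.5]
    s = [2.0, 0.0, 1.0]  # stand-in score estimate
    for t in (0.0, 0.5, 1.0):
        print(t, guided_velocity(v_u, v_c, s, t))
```

With `parallel_gain=1.0` and `t=0` the sketch reduces to plain CFG at scale `w_max`, and at `t=1` it returns the conditional velocity unchanged, so both knobs can be checked in isolation against a CFG baseline.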