Characteristic Guidance: Non-linear Correction for Diffusion Model at Large Guidance Scale (2312.07586v5)

Published 11 Dec 2023 in cs.CV, cs.AI, cs.LG, and physics.data-an

Abstract: Popular guidance for denoising diffusion probabilistic models (DDPMs) linearly combines distinct conditional models to provide enhanced control over samples. However, this approach overlooks nonlinear effects that become significant when the guidance scale is large. To address this issue, we propose characteristic guidance, a guidance method that provides a first-principles non-linear correction for classifier-free guidance. This correction forces the guided DDPMs to respect the Fokker-Planck (FP) equation of the diffusion process, in a way that is training-free and compatible with existing sampling methods. Experiments show that characteristic guidance enhances the semantic characteristics of prompts and mitigates irregularities in image generation, proving effective in diverse applications ranging from simulating magnet phase transitions to latent space sampling.
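
For orientation, the linear combination the abstract refers to can be sketched as below. This is the standard classifier-free guidance rule in one common parameterization (a scale of 1 recovers the conditional prediction), not the characteristic guidance correction proposed in the paper. The function name `cfg_combine` and the use of plain Python lists for the noise predictions are assumptions made for illustration only.

```python
from typing import List, Sequence


def cfg_combine(
    eps_cond: Sequence[float],
    eps_uncond: Sequence[float],
    scale: float,
) -> List[float]:
    """Linearly combine conditional and unconditional noise predictions.

    Hypothetical helper: extrapolates from the unconditional prediction
    toward the conditional one by a factor of `scale`.
    """
    if len(eps_cond) != len(eps_uncond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


# Toy predictions standing in for the outputs of a denoising network.
eps_c = [1.0, 2.0]
eps_u = [0.0, 1.0]

no_guidance = cfg_combine(eps_c, eps_u, 0.0)     # unconditional prediction
plain_cond = cfg_combine(eps_c, eps_u, 1.0)      # conditional prediction
large_scale = cfg_combine(eps_c, eps_u, 3.0)     # extrapolates past both inputs
```

At a scale of 3 the result lies outside the interval spanned by the two input predictions. This is the large-scale regime in which, according to the abstract, a purely linear combination overlooks nonlinear effects.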
