
OMS-DPM: Optimizing the Model Schedule for Diffusion Probabilistic Models (2306.08860v1)

Published 15 Jun 2023 in cs.LG

Abstract: Diffusion probabilistic models (DPMs) are a new class of generative models that have achieved state-of-the-art generation quality in various domains. Despite this promise, one major drawback of DPMs is their slow generation speed, due to the large number of neural network evaluations required in the generation process. In this paper, we reveal an overlooked dimension -- the model schedule -- for optimizing the trade-off between generation quality and speed. More specifically, we observe that small models, though having worse generation quality when used alone, can outperform large models at certain generation steps. Therefore, unlike the traditional practice of using a single model throughout, using different models at different generation steps according to a carefully designed \emph{model schedule} could improve generation quality and speed \emph{simultaneously}. We design OMS-DPM, a predictor-based search algorithm, to optimize the model schedule given an arbitrary generation time budget and a set of pre-trained models. We demonstrate that OMS-DPM can find model schedules that improve both generation quality and speed over prior state-of-the-art methods across the CIFAR-10, CelebA, ImageNet, and LSUN datasets. When applied to the public checkpoints of the Stable Diffusion model, we are able to accelerate sampling by 2$\times$ while maintaining generation quality.
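To make the core idea concrete, the sketch below illustrates what a "model schedule" looks like and how a budget-constrained, predictor-guided search over schedules could be organized. This is a minimal toy, not the paper's implementation: the `MODELS` table, the `predicted_quality` heuristic, and the use of plain random search (the paper uses an evolutionary, predictor-based search) are all hypothetical stand-ins chosen so the example is self-contained and runnable.

```python
import random

# Hypothetical stand-ins for pre-trained denoisers of different sizes.
# In OMS-DPM these would be real DPM networks; here only the per-step
# inference cost matters for the illustration. A schedule may also
# assign no model to a step (i.e., skip it), encoded as None.
MODELS = {
    "small":  {"cost_ms": 5},
    "medium": {"cost_ms": 12},
    "large":  {"cost_ms": 30},
    None:     {"cost_ms": 0},
}

def schedule_cost(schedule):
    """Total inference time of a schedule (one model choice per step)."""
    return sum(MODELS[m]["cost_ms"] for m in schedule)

def predicted_quality(schedule):
    """Placeholder for the learned quality predictor.

    In the paper, a predictor is trained to map a model schedule to an
    estimate of generation quality (e.g., FID). This toy heuristic just
    favors spending capacity on later steps so the search has a signal.
    Lower return value = better predicted quality.
    """
    score = sum(
        MODELS[m]["cost_ms"] * (step + 1)
        for step, m in enumerate(schedule)
        if m is not None
    )
    return -score

def search_schedule(num_steps, time_budget_ms, num_candidates=2000):
    """Random search over model schedules under a time budget.

    Keeps the same interface as a predictor-based search: propose a
    candidate schedule, reject it if it exceeds the budget, otherwise
    score it with the predictor and keep the best.
    """
    best, best_q = None, float("inf")
    choices = list(MODELS)
    for _ in range(num_candidates):
        cand = [random.choice(choices) for _ in range(num_steps)]
        if schedule_cost(cand) > time_budget_ms:
            continue  # violates the generation-time budget
        q = predicted_quality(cand)
        if q < best_q:
            best, best_q = cand, q
    return best

if __name__ == "__main__":
    sched = search_schedule(num_steps=10, time_budget_ms=120)
    print("schedule:", sched, "| cost:", schedule_cost(sched), "ms")
```

The key design point the sketch tries to capture is that the search space is per-step model assignment (including skipping steps), so quality and speed are traded off jointly rather than by a single global choice of network size or step count.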

Citations (27)
