Analyzing Neural Network-Based Generative Diffusion Models through Convex Optimization (2402.01965v3)
Abstract: Diffusion models are gaining widespread use in cutting-edge image, video, and audio generation. Score-based diffusion models stand out among these methods; they require estimating the score function of the input data distribution. In this study, we present a theoretical framework for analyzing two-layer neural network-based diffusion models by reframing score matching and denoising score matching as convex optimization. We prove that training a shallow neural network for score prediction can be done by solving a single convex program. Although most analyses of diffusion models operate in the asymptotic setting or rely on approximations, we characterize the exact predicted score function and establish convergence results for neural network-based diffusion models with finite data. Our results provide a precise characterization of what neural network-based diffusion models learn in non-asymptotic settings.
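To make the objective concrete, here is a minimal sketch of the denoising score matching loss (Vincent, 2011) for a shallow two-layer ReLU score network. All shapes, variable names, and hyperparameters below are illustrative assumptions, not the paper's construction; the paper's contribution is a convex reformulation of this training problem, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer ReLU score network: s_theta(x) = W2 @ relu(W1 @ x + b1).
# Dimensions are placeholder choices for illustration.
d, m, n = 2, 8, 64          # data dim, hidden width, sample count
W1 = rng.normal(size=(m, d)) / np.sqrt(d)
b1 = np.zeros(m)
W2 = rng.normal(size=(d, m)) / np.sqrt(m)

def score_net(X):
    """Shallow network prediction; X has shape (n, d)."""
    H = np.maximum(X @ W1.T + b1, 0.0)   # ReLU hidden features
    return H @ W2.T                       # predicted scores, shape (n, d)

def dsm_loss(X, sigma=0.5):
    """Denoising score matching: perturb each sample with Gaussian noise
    and regress onto the score of the perturbation kernel,
    E || s_theta(x + sigma*eps) + eps/sigma ||^2 ."""
    eps = rng.normal(size=X.shape)
    X_noisy = X + sigma * eps
    target = -eps / sigma                 # score of N(x, sigma^2 I) at x_noisy
    resid = score_net(X_noisy) - target
    return np.mean(np.sum(resid**2, axis=1))

X = rng.normal(size=(n, d))               # stand-in training data
loss = dsm_loss(X)
```

Minimizing this non-convex objective over `(W1, b1, W2)` is what the paper shows can be recast as a single convex program for shallow networks.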