GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration (2301.12686v2)

Published 30 Jan 2023 in cs.LG, cs.AI, cs.CV, cs.SD, and eess.AS

Abstract: Pre-trained diffusion models have been successfully used as priors in a variety of linear inverse problems, where the goal is to reconstruct a signal from noisy linear measurements. However, existing approaches require knowledge of the linear operator. In this paper, we propose GibbsDDRM, an extension of Denoising Diffusion Restoration Models (DDRM) to a blind setting in which the linear measurement operator is unknown. GibbsDDRM constructs a joint distribution of the data, measurements, and linear operator by using a pre-trained diffusion model for the data prior, and it solves the problem by posterior sampling with an efficient variant of a Gibbs sampler. The proposed method is problem-agnostic, meaning that a pre-trained diffusion model can be applied to various inverse problems without fine-tuning. In experiments, it achieved high performance on both blind image deblurring and vocal dereverberation tasks, despite the use of simple generic priors for the underlying linear operators.


Summary

  • The paper introduces GibbsDDRM, a novel method that leverages diffusion models and a partially collapsed Gibbs sampler to solve blind inverse problems.
  • The methodology efficiently samples from complex posterior distributions, improving performance in tasks like image deblurring and vocal dereverberation.
  • Experimental validation demonstrates superior perceptual quality and signal restoration when compared to traditional approaches.

GibbsDDRM: A Novel Approach for Solving Blind Inverse Problems using Diffusion Models

Introduction to Blind Inverse Problems

Blind inverse problems pose a significant challenge across many domains, notably image and audio processing. The task is to reconstruct an original signal or image from observations that have been distorted by an unknown linear operator. Traditional methods for tackling such problems often rely on problem-specific assumptions or extensive training data, which limits their versatility and scope of application.
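Concretely, the measurement model underlying this setting can be written as follows (the notation is illustrative, following the linear-observation setup described in the abstract):

```latex
y = H_{\phi}\, x + z, \qquad z \sim \mathcal{N}(0, \sigma_y^2 I)
```

Here $x$ is the unknown signal, $H_{\phi}$ is a linear operator with unknown parameters $\phi$ (for example a blur kernel or a reverberation filter), and $z$ is measurement noise. In the non-blind case $H_{\phi}$ is known; in the blind case both $x$ and $\phi$ must be recovered from $y$ alone.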

The Proposed Solution: GibbsDDRM

The paper introduces GibbsDDRM, an extension of Denoising Diffusion Restoration Models (DDRM) that addresses these challenges without presupposing knowledge of the corrupting linear operator. The method combines a pre-trained diffusion model with a partially collapsed Gibbs sampler (PCGS) to achieve strong results in the blind setting.

GibbsDDRM constructs a joint distribution that encapsulates the data, the measurements, and the parameters of the linear operator. An efficient sampling method derived from PCGS allows the model to approximate samples from the posterior distribution of the data and the linear operator’s parameters, given the observed measurements. Notably, this approach leverages simple, generic priors for the measurement process parameters, circumventing the need for intricate prior models.
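Schematically, the joint model described above factorizes as (again with illustrative notation):

```latex
p(x, \phi, y) = p(y \mid x, \phi)\; p(x)\; p(\phi)
```

where $p(x)$ is supplied by the pre-trained diffusion model and $p(\phi)$ is a simple generic prior on the operator parameters. GibbsDDRM then draws approximate samples from the posterior $p(x, \phi \mid y)$ using its partially collapsed Gibbs sampler.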

Theoretical Foundations and Experimental Validation

The paper provides a theoretical foundation for the method and validates it through extensive experiments on challenging scenarios such as blind image deblurring and vocal dereverberation. Because the diffusion prior is pre-trained and problem-agnostic, GibbsDDRM can be applied to diverse inverse problems without retraining. In blind image deblurring, it achieves high scores in perceptual similarity and faithfulness to the original images; in vocal dereverberation, it outperforms existing methods in processing quality and signal restoration.

The Methodology in Detail

The methodology hinges on efficient sampling from a complex posterior, using the pre-trained diffusion model to represent the data distribution. The partially collapsed Gibbs sampler alternates between updating the signal and the operator parameters, with some conditional updates marginalizing out other variables (the defining trait of a partially collapsed sampler), which improves mixing and, in turn, the estimation accuracy of both the original signal and the latent parameters of the linear operator.
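The alternating structure can be sketched on a toy problem. This is emphatically not the paper's algorithm: the diffusion prior is replaced by a Gaussian prior, the operator is a single scalar, and the sampler is an ordinary (not partially collapsed) Gibbs sampler, chosen so that every conditional is available in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy blind inverse problem: y = phi * x + noise, where BOTH the signal x
# and the scalar operator parameter phi are unknown. Priors (stand-ins for
# the diffusion prior in GibbsDDRM): x_i ~ N(0, 1) iid, phi ~ N(1, 1).
n = 200
sigma = 0.1                       # measurement-noise std (assumed known here)
phi_true = 0.7
x_true = rng.normal(0.0, 1.0, n)
y = phi_true * x_true + sigma * rng.normal(size=n)

phi = 1.0                         # initialize phi at its prior mean
samples = []
for it in range(500):
    # Sample x | phi, y: elementwise Gaussian (N(0,1) prior x Gaussian likelihood)
    prec_x = 1.0 + phi**2 / sigma**2
    mean_x = (phi * y / sigma**2) / prec_x
    x = mean_x + rng.normal(size=n) / np.sqrt(prec_x)

    # Sample phi | x, y: N(1,1) prior x Gaussian likelihood
    prec_p = 1.0 + np.dot(x, x) / sigma**2
    mean_p = (1.0 + np.dot(x, y) / sigma**2) / prec_p
    phi = mean_p + rng.normal() / np.sqrt(prec_p)
    samples.append(phi)

phi_hat = float(np.mean(samples[100:]))  # posterior mean after burn-in
print(f"true phi = {phi_true:.3f}, estimate = {phi_hat:.3f}")
```

In GibbsDDRM itself, the x-update is replaced by DDRM-style posterior sampling with the diffusion prior, and the partially collapsed construction drops some conditioning variables from certain updates to improve mixing.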

Implications and Future Prospects

GibbsDDRM has broad implications for AI and machine learning, especially for solving inverse problems. Its ability to generalize across different problems without task-specific tuning marks a meaningful step toward more adaptable and efficient AI systems.

This exploration of blind inverse problem-solving also opens avenues for future research, particularly in tailoring diffusion models and sampling techniques to an even broader range of applications. The results further motivate deeper investigation into the theoretical properties of diffusion-based methods and their potential across inverse problems in science and engineering.

Closing Thoughts

GibbsDDRM represents a significant advance in solving blind inverse problems with generative models. Its combination of a pre-trained diffusion prior with a tailored Gibbs sampler highlights the growing importance of adaptable, theory-driven approaches in machine learning. As research continues, its contributions are likely to shape future work on complex inverse problems across many domains.
