GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration (2301.12686v2)
Abstract: Pre-trained diffusion models have been successfully used as priors in a variety of linear inverse problems, where the goal is to reconstruct a signal from noisy linear measurements. However, existing approaches require knowledge of the linear operator. In this paper, we propose GibbsDDRM, an extension of Denoising Diffusion Restoration Models (DDRM) to a blind setting in which the linear measurement operator is unknown. GibbsDDRM constructs a joint distribution of the data, measurements, and linear operator by using a pre-trained diffusion model for the data prior, and it solves the problem by posterior sampling with an efficient variant of a Gibbs sampler. The proposed method is problem-agnostic, meaning that a pre-trained diffusion model can be applied to various inverse problems without fine-tuning. In experiments, it achieved high performance on both blind image deblurring and vocal dereverberation tasks, despite the use of simple generic priors for the underlying linear operators.
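The alternating structure described in the abstract (sample the data given the operator, then the operator given the data) can be illustrated on a toy problem. The sketch below is not GibbsDDRM: it replaces the pre-trained diffusion prior with a standard Gaussian prior, reduces the unknown linear operator to a single scalar gain `a`, and uses a plain Gibbs sampler rather than a partially collapsed one. The function name `toy_blind_gibbs`, the model `y = a * x + z`, and all hyperparameters are assumptions chosen for illustration only.

```python
import numpy as np


def toy_blind_gibbs(y, sigma=0.1, mu_a=1.0, s_a=0.25,
                    n_iter=600, burn_in=100, seed=0):
    """Plain Gibbs sampler for the toy blind model y = a * x + z.

    Assumed priors (illustrative, not from the paper):
      x ~ N(0, I),  a ~ N(mu_a, s_a^2),  z ~ N(0, sigma^2 I).
    Returns posterior samples of x (one row per sample) and of a.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    a = mu_a  # initialise the unknown gain at its prior mean
    xs, gains = [], []
    for it in range(n_iter):
        # Step 1: sample the signal given the current operator, x | a, y.
        # Both prior and likelihood are Gaussian, so the conditional is Gaussian.
        prec_x = 1.0 + a**2 / sigma**2
        mean_x = (a * y / sigma**2) / prec_x
        x = mean_x + rng.standard_normal(y.shape) / np.sqrt(prec_x)

        # Step 2: sample the operator given the current signal, a | x, y.
        prec_a = 1.0 / s_a**2 + (x @ x) / sigma**2
        mean_a = (mu_a / s_a**2 + (x @ y) / sigma**2) / prec_a
        a = mean_a + rng.standard_normal() / np.sqrt(prec_a)

        if it >= burn_in:
            xs.append(x)
            gains.append(a)
    return np.array(xs), np.array(gains)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x_true = rng.standard_normal(50)
    y = 1.5 * x_true + 0.1 * rng.standard_normal(50)
    xs, gains = toy_blind_gibbs(y)
    # Only the product a * x is pinned down by the measurements.
    recon = (gains[:, None] * xs).mean(axis=0)
    print("posterior mean gain:", gains.mean())
    print("max |mean(a * x) - y|:", np.abs(recon - y).max())
```

In this toy setting the measurements constrain only the product `a * x`, so the prior on `a` is what resolves the scale ambiguity. This loosely mirrors the abstract's point that simple generic priors on the operator are part of the joint model.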