Robust Inference of Manifold Density and Geometry by Doubly Stochastic Scaling (2209.08004v2)
Abstract: The Gaussian kernel and its traditional normalizations (e.g., row-stochastic) are popular approaches for assessing similarities between data points. Yet, they can be inaccurate under high-dimensional noise, especially if the noise magnitude varies considerably across the data, e.g., under heteroskedasticity or outliers. In this work, we investigate a more robust alternative: the doubly stochastic normalization of the Gaussian kernel. We consider a setting where points are sampled from an unknown density on a low-dimensional manifold embedded in high-dimensional space and corrupted by possibly strong, non-identically distributed, sub-Gaussian noise. We establish that the doubly stochastic affinity matrix and its scaling factors concentrate around certain population forms, and provide corresponding finite-sample probabilistic error bounds. We then use these results to develop several tools for robust inference under general high-dimensional noise. First, we derive a robust density estimator that reliably infers the underlying sampling density and can substantially outperform the standard kernel density estimator under heteroskedasticity and outliers. Second, we obtain estimators for the pointwise noise magnitudes, the pointwise signal magnitudes, and the pairwise Euclidean distances between clean data points. Lastly, we derive robust graph Laplacian normalizations that accurately approximate various manifold Laplacians, including the Laplace-Beltrami operator, improving over traditional normalizations in noisy settings. We exemplify our results in simulations and on real single-cell RNA-sequencing data. For the latter, we show that, in contrast to traditional methods, our approach is robust to variability in technical noise levels across cell types.
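To make the central object of the abstract concrete, the following is a minimal Python sketch of a doubly stochastic normalization of a Gaussian kernel. It is an illustration under stated assumptions, not the authors' implementation: the zero diagonal, the bandwidth `eps`, the damped fixed-point update, and the function names `gaussian_kernel` and `doubly_stochastic_scaling` are all choices made here for the example. The synthetic data (a circle embedded in 20 dimensions with point-dependent noise levels) is likewise hypothetical.

```python
import numpy as np


def gaussian_kernel(X, eps):
    """Gaussian kernel exp(-||x_i - x_j||^2 / eps) with a zeroed diagonal.

    Zeroing the diagonal is an assumption of this sketch.
    """
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative round-off
    K = np.exp(-sq_dists / eps)
    np.fill_diagonal(K, 0.0)
    return K


def doubly_stochastic_scaling(K, max_iter=1000, tol=1e-10):
    """Find d > 0 such that W = diag(d) K diag(d) has unit row and column sums.

    Uses a damped symmetric fixed-point update, d <- sqrt(d / (K d)),
    which is one common choice and not necessarily the paper's solver.
    """
    d = np.ones(K.shape[0])
    for _ in range(max_iter):
        row_sums = d * (K @ d)  # current row sums of diag(d) K diag(d)
        if np.max(np.abs(row_sums - 1.0)) < tol:
            break
        d = d / np.sqrt(row_sums)  # equivalent to sqrt(d / (K d))
    W = d[:, None] * K * d[None, :]
    return W, d


# Hypothetical data: a circle embedded in 20 dimensions, corrupted by
# Gaussian noise whose standard deviation differs from point to point.
rng = np.random.default_rng(0)
n, dim = 100, 20
t = rng.uniform(0.0, 2.0 * np.pi, size=n)
clean = np.zeros((n, dim))
clean[:, 0], clean[:, 1] = np.cos(t), np.sin(t)
noise_std = rng.uniform(0.02, 0.15, size=n)  # heteroskedastic noise levels
X = clean + noise_std[:, None] * rng.standard_normal((n, dim))

K = gaussian_kernel(X, eps=2.0)
W, d = doubly_stochastic_scaling(K)
print("max |row sum - 1|:", np.max(np.abs(W.sum(axis=1) - 1.0)))
```

Because `K` is symmetric, a single vector of scaling factors `d` suffices and the resulting `W` is symmetric, so unit row sums imply unit column sums. The vector `d` plays the role of the scaling factors mentioned in the abstract; how the paper turns those factors into density or noise estimates is not reproduced here.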
Authors: Boris Landa, Xiuyuan Cheng