
Sketching the Heat Kernel: Using Gaussian Processes to Embed Data (2403.07929v1)

Published 1 Mar 2024 in cs.LG, cs.NA, math.NA, and stat.ML

Abstract: This paper introduces a novel, non-deterministic method for embedding data in low-dimensional Euclidean space based on computing realizations of a Gaussian process depending on the geometry of the data. This type of embedding first appeared in (Adler et al., 2018) as a theoretical model for a generic manifold in high dimensions. In particular, we take the covariance function of the Gaussian process to be the heat kernel, and computing the embedding amounts to sketching a matrix representing the heat kernel. The Karhunen-Loève expansion reveals that the straight-line distances in the embedding approximate the diffusion distance in a probabilistic sense, avoiding the need for sharp cutoffs and maintaining some of the smaller-scale structure. Our method demonstrates further advantage in its robustness to outliers. We justify the approach with both theory and experiments.
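The idea in the abstract can be illustrated with a minimal sketch, under assumptions that are not taken from the paper: a Gaussian kernel with symmetric degree normalization stands in for a heat-kernel matrix `A`, and the function names (`pairwise_sq_dists`, `heat_kernel_sketch`) and the parameters `eps` and `k` are hypothetical. Multiplying `A` by a Gaussian random matrix yields `k` realizations of a Gaussian process with covariance `A @ A.T`, and the expected squared distance between two embedded points equals the squared distance between the corresponding rows of `A`, which is a diffusion-type distance.

```python
import numpy as np


def pairwise_sq_dists(Z):
    """Squared Euclidean distances between all rows of Z (fine for small n)."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def heat_kernel_sketch(X, eps, k, rng):
    """Embed the rows of X into R^k by sketching a kernel matrix.

    Assumption (not from the paper): the heat kernel is approximated by a
    Gaussian kernel with bandwidth eps, normalized symmetrically by degrees.
    """
    K = np.exp(-pairwise_sq_dists(X) / eps)
    deg = K.sum(axis=1)
    A = K / np.sqrt(np.outer(deg, deg))       # symmetric stand-in for the heat kernel
    G = rng.standard_normal((X.shape[0], k))  # Gaussian sketching matrix
    # Each column of A @ G is one Gaussian-process realization with covariance A @ A.T.
    Y = A @ G / np.sqrt(k)
    return Y, A


# Small usage example on synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 3))
Y, A = heat_kernel_sketch(X, eps=2.0, k=2000, rng=rng)
ratio_mask = ~np.eye(30, dtype=bool)
ratios = pairwise_sq_dists(Y)[ratio_mask] / pairwise_sq_dists(A)[ratio_mask]
print(ratios.min(), ratios.max())  # both should be close to 1 for large k
```

Because the embedding is random, the distances match only in expectation; the spread of the ratios shrinks as `k` grows. A bi-stochastic or other normalization could replace the symmetric one used here without changing the sketching step.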

References (21)
  1. Convergence of the reach for a sequence of Gaussian-embedded manifolds. Probab. Theory Related Fields, 171(3-4):1045–1091, 2018.
  2. Random fields and geometry, volume 80. Springer, 2007.
  3. Jonathan Bates. The embedding dimension of Laplacian eigenfunction maps. Appl. Comput. Harmon. Anal., 37(3):516–530, 2014.
  4. Laplacian eigenmaps for dimensionality reduction and data representation. Neural computation, 15(6):1373–1396, 2003.
  5. Embedding Riemannian manifolds by their heat kernel. Geom. Funct. Anal., 4(4):373–398, 1994.
  6. Diffusion maps for changing data. Appl. Comput. Harmon. Anal., 36(1):79–107, 2014.
  7. Diffusion maps. Appl. Comput. Harmon. Anal., 21(1):5–30, 2006.
  8. On the convergence rate of Sinkhorn’s algorithm, 2022.
  9. Universal local parametrizations via heat kernels and eigenfunctions of the Laplacian. Ann. Acad. Sci. Fenn. Math., 35(1):131–174, 2010.
  10. Philip A. Knight. The Sinkhorn-Knopp algorithm: convergence and applications. SIAM J. Matrix Anal. Appl., 30(1):261–275, 2008.
  11. The intrinsic geometry of some random manifolds. Electron. Commun. Probab., 22:Paper No. 1, 12, 2017.
  12. Stephane S. Lafon. Diffusion maps and geometric harmonics. ProQuest LLC, Ann Arbor, MI, 2004. Thesis (Ph.D.)–Yale University.
  13. Doubly stochastic normalization of the Gaussian kernel is robust to heteroskedastic noise. SIAM J. Math. Data Sci., 3(1):388–413, 2021.
  14. Spectral methods for uncertainty quantification. Scientific Computation. Springer, New York, 2010. With applications to computational fluid dynamics.
  15. Probability in Banach spaces. Classics in Mathematics. Springer-Verlag, Berlin, 2011. Isoperimetry and processes, Reprint of the 1991 edition.
  16. John M. Lee. Introduction to smooth manifolds, volume 218 of Graduate Texts in Mathematics. Springer, New York, second edition, 2013.
  17. Manifold learning with bi-stochastic kernels. IMA J. Appl. Math., 84(3):455–482, 2019.
  18. Per-Gunnar Martinsson. Randomized methods for matrix computations, 2019.
  19. Stephen Semmes. On the nonexistence of bi-Lipschitz parameterizations and geometric problems about $A_{\infty}$-weights. Rev. Mat. Iberoamericana, 12(2):337–410, 1996.
  20. K. T. Sturm. Diffusion processes and heat kernels on metric spaces. Ann. Probab., 26(1):1–55, 1998.
  21. Roman Vershynin. High-dimensional probability, volume 47 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2018. An introduction with applications in data science, With a foreword by Sara van de Geer.