Generalization Error Curves for Analytic Spectral Algorithms under Power-law Decay (2401.01599v3)

Published 3 Jan 2024 in cs.LG, math.ST, and stat.TH

Abstract: The generalization error curve of a kernel regression method aims to determine the exact order of the generalization error for various source conditions, noise levels, and choices of the regularization parameter, rather than only the minimax rate. In this work, under mild assumptions, we rigorously provide a full characterization of the generalization error curves of the kernel gradient descent method (and a large class of analytic spectral algorithms) in kernel regression. Consequently, we can sharpen the near inconsistency of kernel interpolation and clarify the saturation effects of kernel regression algorithms with higher qualification. Thanks to neural tangent kernel theory, these results greatly improve our understanding of the generalization behavior of training wide neural networks. A novel technical contribution, the analytic functional argument, might be of independent interest.
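To make the setting concrete, the sketch below runs kernel gradient descent for kernel regression, the estimator whose generalization error curve the paper characterizes. All specifics here are illustrative assumptions rather than choices from the paper: the Gaussian kernel, the synthetic sine target, the noise level, and the step size are made up for the example. Sweeping the number of gradient steps (which acts as the inverse regularization parameter) and recording the estimated excess risk gives an empirical analogue of the error curves the paper analyzes theoretically under power-law spectral decay.

```python
import numpy as np

# Minimal sketch of kernel gradient descent for kernel regression.
# The Gaussian kernel, the target sin(2*pi*x), the sample size, and the
# step size are illustrative assumptions, not choices made in the paper.

def rbf_kernel(X, Z, bandwidth=0.25):
    """Gaussian (RBF) kernel matrix between the 1-D point sets X and Z."""
    sq_dists = (X[:, None] - Z[None, :]) ** 2
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

rng = np.random.default_rng(0)
n, n_test, noise = 200, 1000, 0.1

X = rng.uniform(0.0, 1.0, n)                      # training inputs
y = np.sin(2 * np.pi * X) + noise * rng.standard_normal(n)
X_test = np.linspace(0.0, 1.0, n_test)
f_star_test = np.sin(2 * np.pi * X_test)          # noiseless regression function

K = rbf_kernel(X, X)            # n x n kernel matrix on the sample
K_test = rbf_kernel(X_test, X)  # test-vs-train kernel matrix

# Represent the iterate as f_t(x) = sum_i alpha_i(t) * k(x, x_i) and run
# gradient descent on the empirical squared loss.  The number of steps t
# plays the role of the inverse regularization parameter: sweeping t and
# recording the test error traces out a generalization error curve.
alpha = np.zeros(n)
step = 1.0 / np.linalg.eigvalsh(K).max()          # keeps the iteration stable

for t in range(1, 2001):
    alpha += step * (y - K @ alpha)               # one gradient step
    if t in (10, 100, 1000, 2000):
        excess_risk = np.mean((K_test @ alpha - f_star_test) ** 2)
        print(f"steps = {t:5d}   estimated excess risk = {excess_risk:.4f}")
```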
