BP(λ): Online Learning via Synthetic Gradients (2401.07044v1)

Published 13 Jan 2024 in cs.LG

Abstract: Training recurrent neural networks typically relies on backpropagation through time (BPTT). BPTT requires the forward and backward passes to be completed before loss gradients become available, locking the network to these computations. Recently, Jaderberg et al. proposed synthetic gradients to alleviate the need for full BPTT. In their implementation, synthetic gradients are learned through a mixture of backpropagated gradients and bootstrapped synthetic gradients, analogous to the temporal difference (TD) algorithm in Reinforcement Learning (RL). However, as in TD learning, heavy use of bootstrapping can introduce bias, which leads to poor synthetic gradient estimates. Inspired by the accumulate $\mathrm{TD}(\lambda)$ algorithm in RL, we propose a fully online method for learning synthetic gradients which avoids the use of BPTT altogether: accumulate $\mathrm{BP}(\lambda)$. As in accumulate $\mathrm{TD}(\lambda)$, we show analytically that accumulate $\mathrm{BP}(\lambda)$ can control the level of bias by using a mixture of temporal difference errors and recursively defined eligibility traces. We next demonstrate empirically that our model outperforms the original implementation for learning synthetic gradients on a variety of tasks, and is particularly well suited to capturing longer timescales. Finally, building on recent work, we reflect on accumulate $\mathrm{BP}(\lambda)$ as a principle for learning in biological circuits. In summary, inspired by RL principles, we introduce an algorithm capable of bias-free online learning via synthetic gradients.
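
The abstract leans on an analogy with accumulate TD(λ) from reinforcement learning, where a bootstrapped temporal-difference error is combined with a recursively defined eligibility trace so that weights can be updated fully online. As a point of reference for that analogy, the sketch below shows standard accumulate TD(λ) with a linear value function, as described in Sutton & Barto (reference 24); it is not code from this paper, and the function name and arguments are illustrative only. Accumulate BP(λ) adapts this trace-based template so that a synthetic-gradient module, rather than a value function, is learned online without unrolling BPTT.

```python
# A minimal sketch of accumulate TD(lambda) with linear function approximation,
# following the standard formulation in Sutton & Barto (reference 24).
# All function and variable names here are illustrative, not from the paper.
import numpy as np

def accumulate_td_lambda(episodes, n_features, alpha=0.1, gamma=0.99, lam=0.9):
    """Estimate a linear value function V(s) = w . phi(s) fully online.

    `episodes` is an iterable of trajectories, each a list of
    (phi_s, reward, phi_s_next, done) tuples, where phi_* are feature
    vectors of length `n_features`.
    """
    w = np.zeros(n_features)                 # value-function weights
    for episode in episodes:
        e = np.zeros(n_features)             # eligibility trace, reset per episode
        for phi_s, reward, phi_s_next, done in episode:
            v_s = w @ phi_s                  # current value estimate
            v_next = 0.0 if done else w @ phi_s_next
            delta = reward + gamma * v_next - v_s   # TD error (bootstrapped target)
            e = gamma * lam * e + phi_s             # recursively accumulated trace
            w = w + alpha * delta * e               # online update, no backward unroll
    return w
```

In this template, λ interpolates between heavy bootstrapping (λ = 0) and relying on full returns (λ = 1); this is the same knob the abstract describes accumulate BP(λ) using to control bias in its synthetic gradient estimates.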

References (28)
  1. Distributional coding of associative learning within projection-defined populations of midbrain dopamine neurons. bioRxiv, 2022.
  2. Circuit architecture of VTA dopamine neurons revealed by systematic input-output mapping. Cell, 162(3):622–634, 2015.
  3. Cerebro-cerebellar networks facilitate learning through feedback decoupling. Nature Communications, 14(1):1–18, 2023.
  4. Understanding synthetic gradients and decoupled neural interfaces. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pp. 904–912. JMLR.org, 2017.
  5. Li Deng. The MNIST database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6):141–142, 2012.
  6. Eligibility traces and plasticity on behavioral time scales: experimental support of neohebbian three-factor learning rules. Frontiers in Neural Circuits, 12:53, 2018.
  7. Neural turing machines. arXiv preprint arXiv:1410.5401, 2014.
  8. Gradient descent happens in a tiny subspace. arXiv preprint arXiv:1812.04754, 2018.
  9. Long Short-Term Memory. Neural Computation, 9(8):1735–1780, 1997. doi: 10.1162/neco.1997.9.8.1735. URL https://doi.org/10.1162/neco.1997.9.8.1735.
  10. Dopamine neurons report an error in the temporal prediction of reward during learning. Nature Neuroscience, 1(4):304–309, 1998.
  11. Decoupled neural interfaces using synthetic gradients. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pp. 1627–1635. JMLR.org, 2017.
  12. Cerebellar supervised learning revisited: biophysical modeling and degrees-of-freedom control. Current Opinion in Neurobiology, 21(5):791–800, 2011.
  13. 50 years since the Marr, Ito, and Albus models of the cerebellum. Neuroscience, 462:151–174, 2021.
  14. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  15. A simple way to initialize recurrent networks of rectified linear units. arXiv preprint arXiv:1504.00941, 2015.
  16. Intact-brain analyses reveal distinct information carried by SNc dopamine subcircuits. Cell, 162(3):635–647, 2015.
  17. Backpropagation through time and the brain. Current Opinion in Neurobiology, 55:82–89, 2019.
  18. Evaluating biological plausibility of learning algorithms the lazy way. In Real Neurons & Hidden Units: Future directions at the intersection of neuroscience and artificial intelligence @ NeurIPS 2019, 2019. URL https://openreview.net/forum?id=HJgPEXtIUS.
  19. A unified framework of online learning algorithms for training recurrent neural networks. Journal of Machine Learning Research, 21(135):1–34, 2020.
  20. Vijay Mohan K Namboodiri and Garret D Stuber. The learning of prospective and retrospective cognitive maps within neural circuits. Neuron, 109(22):3552–3575, 2021.
  21. Ohmae S, Medina JF. Plasticity of ponto-cerebellar circuits generates a prospective error signal in climbing fiber. Program No. 579.01. Neuroscience 2019 Abstracts. Chicago, IL: Society for Neuroscience, 2019. Online.
  22. Cortico-cerebellar networks as decoupling neural interfaces. Advances in Neural Information Processing Systems, 34, 2021.
  23. Current state and future directions for learning in biological recurrent neural networks: A perspective piece. arXiv preprint arXiv:2105.05382, 2021.
  24. Reinforcement learning: An introduction. MIT Press, 2018.
  25. Long range arena: A benchmark for efficient transformers. arXiv preprint arXiv:2011.04006, 2020.
  26. True online temporal-difference learning. The Journal of Machine Learning Research, 17(1):5057–5096, 2016.
  27. A learning algorithm for continually running fully recurrent neural networks. Neural Computation, 1(2):270–280, 1989.
  28. A critical time window for dopamine actions on the structural plasticity of dendritic spines. Science, 345(6204):1616–1620, 2014.
