
An introduction to graphical tensor notation for mechanistic interpretability (2402.01790v1)

Published 2 Feb 2024 in cs.LG and cs.AI

Abstract: Graphical tensor notation is a simple way of denoting linear operations on tensors, originating from physics. Modern deep learning consists almost entirely of operations on or between tensors, so easily understanding tensor operations is quite important for understanding these systems. This is especially true when attempting to reverse-engineer the algorithms learned by a neural network in order to understand its behavior: a field known as mechanistic interpretability. It's often easy to get confused about which operations are happening between tensors and lose sight of the overall structure, but graphical tensor notation makes it easier to parse things at a glance and see interesting equivalences. The first half of this document introduces the notation and applies it to some decompositions (SVD, CP, Tucker, and tensor network decompositions), while the second half applies it to some existing foundational approaches for mechanistically understanding LLMs, loosely following "A Mathematical Framework for Transformer Circuits", then constructing an example "induction head" circuit in graphical tensor notation.
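
As a rough illustration of the two halves the abstract describes (a minimal numpy sketch, not code from the paper; the tensor names and shapes are invented for the example): edges in a tensor diagram correspond to repeated einsum indices, and the SVD's truncation to the largest singular values gives the best low-rank approximation (the Eckart-Young-Mirsky result, refs. 14-15 below).

import numpy as np

rng = np.random.default_rng(0)

# 1) A small tensor network as an einsum: diagrammatically, contracting
#    A(i,j) -- B(j,k,l) -- C(l,m) over the shared legs j and l leaves a
#    tensor with three open legs (i, k, m).
A = rng.normal(size=(2, 3))
B = rng.normal(size=(3, 4, 5))
C = rng.normal(size=(5, 6))
T = np.einsum("ij,jkl,lm->ikm", A, B, C)
print(T.shape)  # (2, 4, 6)

# 2) SVD as splitting one node into two, plus truncation: keeping the r
#    largest singular values gives the best rank-r approximation in
#    Frobenius norm, with error equal to the discarded singular values'
#    root-sum-of-squares.
M = rng.normal(size=(8, 8))
U, S, Vh = np.linalg.svd(M, full_matrices=False)
r = 3
M_r = (U[:, :r] * S[:r]) @ Vh[:r]
err = np.linalg.norm(M - M_r)
print(np.isclose(err, np.sqrt(np.sum(S[r:] ** 2))))  # True

The same contractions can be written with the quimb package (ref. 3), which the paper cites for tensor network computations; plain einsum is used here only to keep the sketch self-contained.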

References (34)
  1. The Tensor Network, Jul 2023. URL https://tensornetwork.org. [Online; accessed 6. Jul. 2023].
  2. Roger Penrose. Applications of Negative Dimensional Tensors. Combinatorial Mathematics and its Applications, pages 221–244, 1971.
  3. Johnnie Gray. quimb: A python package for quantum information and many-body calculations. Journal of Open Source Software, 3(29):819, September 2018. ISSN 2475-9066. doi: 10.21105/joss.00819.
  4. A mathematical framework for transformer circuits. Transformer Circuits Thread, 2021. https://transformer-circuits.pub/2021/framework/index.html.
  5. Tai-Danae Bradley. Matrices as Tensor Network Diagrams, Jun 2021. URL https://www.math3ma.com/blog/matrices-as-tensor-network-diagrams. [Online; accessed 18. Jun. 2021].
  6. Simon Verret. Tensor network diagrams of typical neural networks. Simon Verret’s, Feb 2019. URL https://simonverret.github.io/2019/02/16/tensor-network-diagrams-of-typical-neural-network.html.
  7. Glen Evenbly. Tensors.net, July 2023. URL https://www.tensors.net. [Online; accessed 6. Jul. 2023].
  8. Hand-waving and interpretive dance: an introductory course on tensor networks. J. Phys. A: Math. Theor., 50(22):223001, May 2017. ISSN 1751-8113. doi: 10.1088/1751-8121/aa6dc3.
  9. Graph Tensor Networks: An Intuitive Framework for Designing Large-Scale Neural Learning Systems on Multiple Domains. arXiv, March 2023. doi: 10.48550/arXiv.2303.13565.
  10. On Optimizing a Class of Multi-Dimensional Loops with Reduction for Parallel Execution. Parallel Process. Lett., 07(02):157–168, Jun 1997. ISSN 0129-6264. doi: 10.1142/S0129626497000176.
  11. The complexity of tensor calculus. Comput. Complexity, 11(1):54–89, June 2002. ISSN 1420-8954. doi: 10.1007/s00037-000-0170-4.
  12. Callum McDougall. Six (and a half) intuitions for SVD, July 2023. URL https://www.lesswrong.com/posts/iupCxk3ddiJBAJkts/six-and-a-half-intuitions-for-svd. [Online; accessed 1. Feb. 2024].
  13. Erhard Schmidt. Zur Theorie der linearen und nichtlinearen Integralgleichungen. Math. Ann., 63(4):433–476, December 1907. ISSN 1432-1807. doi: 10.1007/BF01449770.
  14. The approximation of one matrix by another of lower rank. Psychometrika, 1(3):211–218, September 1936. ISSN 1860-0980. doi: 10.1007/BF02288367.
  15. L. Mirsky. Symmetric Gauge Functions and Unitarily Invariant Norms. Q. J. Math., 11(1):50–59, January 1960. ISSN 0033-5606. doi: 10.1093/qmath/11.1.50.
  16. Frank L. Hitchcock. The Expression of a Tensor or a Polyadic as a Sum of Products. J. Math. Phys., 6(1-4):164–189, April 1927. ISSN 0097-1421. doi: 10.1002/sapm192761164.
  17. Analysis of individual differences in multidimensional scaling via an n-way generalization of “Eckart-Young” decomposition. Psychometrika, 35(3):283–319, September 1970. ISSN 1860-0980. doi: 10.1007/BF02310791.
  18. Richard A. Harshman. Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multi-modal factor analysis. UCLA Working Papers in Phonetics, 16:1–84, 1970.
  19. Ledyard R. Tucker. Some mathematical notes on three-mode factor analysis. Psychometrika, 31(3):279–311, September 1966. ISSN 1860-0980. doi: 10.1007/BF02289464.
  20. Johan Håstad. Tensor rank is NP-complete. J. Algorithms, 11(4):644–654, December 1990. ISSN 0196-6774. doi: 10.1016/0196-6774(90)90014-6.
  21. Finitely correlated states on quantum spin chains. Commun. Math. Phys., 144(3):443–490, March 1992. ISSN 1432-0916. doi: 10.1007/BF02099178.
  22. Groundstate properties of a generalized VBS-model. Z. Phys. B: Condens. Matter, 87(3):281–287, October 1992. ISSN 1431-584X. doi: 10.1007/BF01309281.
  23. Thermodynamic Limit of Density Matrix Renormalization. Phys. Rev. Lett., 75(19):3537–3540, November 1995. ISSN 1079-7114. doi: 10.1103/PhysRevLett.75.3537.
  24. Guifré Vidal. Efficient Classical Simulation of Slightly Entangled Quantum Computations. Phys. Rev. Lett., 91(14):147902, October 2003. ISSN 1079-7114. doi: 10.1103/PhysRevLett.91.147902.
  25. Ian P. McCulloch. From density-matrix renormalization group to matrix product states. J. Stat. Mech.: Theory Exp., 2007(10):P10014, October 2007. ISSN 1742-5468. doi: 10.1088/1742-5468/2007/10/P10014.
  26. I. V. Oseledets. Tensor-Train Decomposition. SIAM J. Sci. Comput., September 2011. URL https://epubs.siam.org/doi/abs/10.1137/090752286.
  27. M. B. Hastings. An area law for one-dimensional quantum systems. J. Stat. Mech.: Theory Exp., 2007(08):P08024–P08024, Aug 2007. ISSN 1742-5468. doi: 10.1088/1742-5468/2007/08/p08024.
  28. Lluís Masanes. Area law for the entropy of low-energy states. Phys. Rev. A, 80(5):052104, Nov 2009. ISSN 2469-9934. doi: 10.1103/PhysRevA.80.052104.
  29. An area law and sub-exponential algorithm for 1D systems. arXiv, Jan 2013. URL https://arxiv.org/abs/1301.1162v1.
  30. An area law for 2D frustration-free spin systems. arXiv, Mar 2021. URL https://arxiv.org/abs/2103.02492v2.
  31. Transformer feed-forward layers are key-value memories. In Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih, editors, Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 5484–5495, Online and Punta Cana, Dominican Republic, November 2021. Association for Computational Linguistics. doi: 10.18653/v1/2021.emnlp-main.446. URL https://aclanthology.org/2021.emnlp-main.446.
  32. Locating and editing factual associations in GPT. Advances in Neural Information Processing Systems, 36, 2022.
  33. In-context learning and induction heads. Transformer Circuits Thread, 2022. https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html.
  34. Towards Automated Circuit Discovery for Mechanistic Interpretability. arXiv, April 2023. doi: 10.48550/arXiv.2304.14997.
