Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 170 tok/s
Gemini 2.5 Pro 46 tok/s Pro
GPT-5 Medium 33 tok/s Pro
GPT-5 High 31 tok/s Pro
GPT-4o 80 tok/s Pro
Kimi K2 191 tok/s Pro
GPT OSS 120B 432 tok/s Pro
Claude Sonnet 4.5 37 tok/s Pro
2000 character limit reached

Automatic Differentiation for Tensor Algebras (1711.01348v1)

Published 3 Nov 2017 in cs.SC and stat.ML

Abstract: Kjolstad et. al. proposed a tensor algebra compiler. It takes expressions that define a tensor element-wise, such as $f_{ij}(a,b,c,d) = \exp\left[-\sum_{k=0}4 \left((a_{ik}+b_{jk})2\, c_{ii} + d_{i+k}3 \right) \right]$, and generates the corresponding compute kernel code. For machine learning, especially deep learning, it is often necessary to compute the gradient of a loss function $l(a,b,c,d)=l(f(a,b,c,d))$ with respect to parameters $a,b,c,d$. If tensor compilers are to be applied in this field, it is necessary to derive expressions for the derivatives of element-wise defined tensors, i.e. expressions for $(da){ik}=\partial l/\partial a{ik}$. When the mapping between function indices and argument indices is not 1:1, special attention is required. For the function $f_{ij} (x) = x_i2$, the derivative of the loss is $(dx)i=\partial l/\partial x_i=\sum_j (df){ij}2x_i$; the sum is necessary because index $j$ does not appear in the indices of $f$. Another example is $f_{i}(x)=x_{ii}2$, where $x$ is a matrix; here we have $(dx){ij}=\delta{ij}(df)i2x{ii}$; the Kronecker delta is necessary because the derivative is zero for off-diagonal elements. Another indexing scheme is used by $f_{ij}(x)=\exp x_{i+j}$; here the correct derivative is $(dx){k}=\sum_i (df){i,k-i} \exp x_{k}$, where the range of the sum must be chosen appropriately. In this publication we present an algorithm that can handle any case in which the indices of an argument are an arbitrary linear combination of the indices of the function, thus all the above examples can be handled. Sums (and their ranges) and Kronecker deltas are automatically inserted into the derivatives as necessary. Additionally, the indices are transformed, if required (as in the last example). The algorithm outputs a symbolic expression that can be subsequently fed into a tensor algebra compiler. Source code is provided.

Citations (2)

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.