CoNST: Code Generator for Sparse Tensor Networks (2401.04836v1)
Abstract: Sparse tensor networks are commonly used to represent contractions over sparse tensors. Tensor contractions are higher-order analogs of matrix multiplication. Tensor networks arise commonly in many domains of scientific computing and data science. After a transformation into a tree of binary contractions, the network is implemented as a sequence of individual contractions. Several critical aspects must be considered in the generation of efficient code for a contraction tree, including sparse tensor layout mode order, loop fusion to reduce intermediate tensors, and the interdependence of loop order, mode order, and contraction order. We propose CoNST, a novel approach that considers these factors in an integrated manner using a single formulation. Our approach creates a constraint system that encodes these decisions and their interdependence, while aiming to produce reduced-order intermediate tensors via fusion. The constraint system is solved by the Z3 SMT solver and the result is used to create the desired fused loop structure and tensor mode layouts for the entire contraction tree. This structure is lowered to the IR of the TACO compiler, which is then used to generate executable code. Our experimental evaluation demonstrates very significant (sometimes orders of magnitude) performance improvements over current state-of-the-art sparse tensor compiler/library alternatives.
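To make the abstract's point about fusion and reduced-order intermediates concrete, here is a minimal pure-Python sketch. It is not CoNST's algorithm and uses neither Z3 nor TACO. It hand-codes one small contraction tree, R[i,l] = Σ_k (Σ_j A[i,j]·B[j,k])·C[k,l], in two ways. The nested-dictionary layout and the names `to_rows`, `unfused`, and `fused` are assumptions made for illustration only.

```python
from collections import defaultdict

def to_rows(coo):
    """Group {(r, c): v} into {r: {c: v}}, a CSR-like layout with mode order (r, c)."""
    rows = defaultdict(dict)
    for (r, c), v in coo.items():
        rows[r][c] = v
    return rows

def unfused(A, B, C):
    """Run the two binary contractions one after the other."""
    # T[i,k] = sum_j A[i,j] * B[j,k]; T is materialized as an order-2 intermediate.
    T = defaultdict(lambda: defaultdict(float))
    for i, a_row in A.items():
        for j, a in a_row.items():
            for k, b in B.get(j, {}).items():
                T[i][k] += a * b
    # R[i,l] = sum_k T[i,k] * C[k,l]
    R = defaultdict(float)
    for i, t_row in T.items():
        for k, t in t_row.items():
            for l, c in C.get(k, {}).items():
                R[(i, l)] += t * c
    return dict(R)

def fused(A, B, C):
    """Fuse both contractions under a shared outer i loop."""
    R = defaultdict(float)
    for i, a_row in A.items():
        # Inside the fused i loop the intermediate is only order 1 (indexed by k).
        t = defaultdict(float)
        for j, a in a_row.items():
            for k, b in B.get(j, {}).items():
                t[k] += a * b
        for k, tv in t.items():
            for l, c in C.get(k, {}).items():
                R[(i, l)] += tv * c
    return dict(R)

# Small hypothetical inputs in coordinate form.
A = to_rows({(0, 0): 1.0, (0, 2): 2.0, (1, 1): 3.0})
B = to_rows({(0, 1): 4.0, (2, 1): 5.0, (1, 0): 6.0})
C = to_rows({(1, 0): 7.0, (0, 2): 8.0})

print(fused(A, B, C) == unfused(A, B, C))
```

The fused loop nest is only valid here because each tensor's stored mode order matches the order in which the loops visit its indices (A by i then j, B by j then k, C by k then l). That coupling between loop order, mode order, and contraction order is the interdependence the abstract says CoNST encodes as constraints.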
Authors:
- Saurabh Raje
- Yufan Xu
- Atanas Rountev
- Edward F. Valeev
- Saday Sadayappan