A Generalized Language Model in Tensor Space (1901.11167v1)

Published 31 Jan 2019 in cs.CL and cs.LG

Abstract: In the literature, tensors have been effectively used for capturing the context information in language models. However, the existing methods usually adopt relatively low-order tensors, which have limited expressive power in modeling language. Developing a higher-order tensor representation is challenging, in terms of deriving an effective solution and showing its generality. In this paper, we propose a language model named Tensor Space Language Model (TSLM), by utilizing tensor networks and tensor decomposition. In TSLM, we build a high-dimensional semantic space constructed by the tensor product of word vectors. Theoretically, we prove that such a tensor representation is a generalization of the n-gram language model. We further show that this high-order tensor representation can be decomposed into a recursive calculation of conditional probability for language modeling. The experimental results on the Penn Tree Bank (PTB) dataset and the WikiText benchmark demonstrate the effectiveness of TSLM.
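
The link between tensor products of word vectors and n-gram models can be illustrated with a toy example. The sketch below is not the paper's implementation; it assumes one-hot word vectors, a three-word vocabulary, and made-up bigram probabilities, and the helper names (`tensor_product`, `one_hot`, `inner`, `bigram_prob`) are hypothetical. Under those assumptions, the tensor product of two one-hot vectors selects exactly one entry of a joint-probability tensor, which is the bigram lookup of a count-based model.

```python
from itertools import product

def tensor_product(*vectors):
    """Tensor (outer) product of the given vectors, stored as {index_tuple: value}."""
    tensor = {}
    for idx in product(*(range(len(v)) for v in vectors)):
        value = 1.0
        for i, v in zip(idx, vectors):
            value *= v[i]
        tensor[idx] = value
    return tensor

def one_hot(i, size):
    """One-hot basis vector of length `size` with a 1.0 at position i."""
    return [1.0 if j == i else 0.0 for j in range(size)]

def inner(a, b):
    """Inner product of two tensors stored as dicts; missing entries count as 0."""
    return sum(a[k] * b.get(k, 0.0) for k in a)

# Hypothetical toy vocabulary and bigram joint probabilities (illustrative only).
vocab = ["the", "cat", "sat"]
joint = {(0, 1): 0.5, (1, 2): 0.3, (0, 2): 0.2}

def bigram_prob(w1, w2):
    """Joint probability of (w1, w2), read off via a tensor product of one-hot vectors."""
    n = len(vocab)
    basis = tensor_product(one_hot(vocab.index(w1), n), one_hot(vocab.index(w2), n))
    return inner(basis, joint)

def conditional_prob(w2, w1):
    """p(w2 | w1), obtained by normalizing the joint over all possible next words."""
    total = sum(bigram_prob(w1, w) for w in vocab)
    return bigram_prob(w1, w2) / total if total > 0 else 0.0
```

With dense (non-one-hot) word vectors the same tensor product no longer picks out a single entry, which is the sense in which a tensor-space representation can be more general than an n-gram table. The number of entries also grows as the vocabulary size raised to the order n, which is why a decomposition is needed in practice.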

Citations (16)

Related Papers

Tensor Ring Decomposition (2016)
Multi-Tensor Network Representation for High-Order Tensor Completion (2021)
Language Modeling Using Tensor Trains (2024)
Searching to Sparsify Tensor Decomposition for N-ary Relational Data (2021)
Modeling of languages for tensor manipulation (2018)