Nested LSTMs

(1801.10308)
Published Jan 31, 2018 in cs.CL and cs.LG

Abstract

We propose Nested LSTMs (NLSTM), a novel RNN architecture with multiple levels of memory. Nested LSTMs add depth to LSTMs via nesting as opposed to stacking. The value of a memory cell in an NLSTM is computed by an LSTM cell, which has its own inner memory cell. Specifically, instead of computing the value of the (outer) memory cell as $c^{outer}_t = f_t \odot c_{t-1} + i_t \odot g_t$, NLSTM memory cells use the concatenation $(f_t \odot c_{t-1}, i_t \odot g_t)$ as input to an inner LSTM (or NLSTM) memory cell, and set $c^{outer}_t = h^{inner}_t$. Nested LSTMs outperform both stacked and single-layer LSTMs with similar numbers of parameters in our experiments on various character-level language modeling tasks, and the inner memories of an LSTM learn longer-term dependencies compared with the higher-level units of a stacked LSTM.
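As a rough illustration of the cell update described in the abstract, below is a minimal sketch of one Nested LSTM step in NumPy. The function names (`lstm_cell`, `nested_lstm_cell`), the parameter layout, and the use of `tanh` on the outer output are assumptions made for illustration, not the authors' implementation: the outer cell computes its gates as in a standard LSTM, but feeds $(f_t \odot c_{t-1}, i_t \odot g_t)$ to an inner LSTM cell as its (hidden state, input) pair and sets the outer memory to the inner cell's output.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """Standard LSTM cell. W, U, b stack the parameters for the
    input (i), forget (f), output (o) gates and the candidate (g)."""
    z = W @ x + U @ h_prev + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

def nested_lstm_cell(x, h_prev, c_prev, c_inner_prev, outer_params, inner_params):
    """One step of a Nested LSTM cell (sketch under the assumptions above).

    Instead of the additive update c_t = f_t * c_{t-1} + i_t * g_t,
    the pair (f_t * c_{t-1}, i_t * g_t) is handed to an inner LSTM cell
    as its (hidden state, input), and the inner cell's output h_inner
    becomes the outer memory c_t.
    """
    W, U, b = outer_params
    z = W @ x + U @ h_prev + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)

    # Inner LSTM replaces the plain LSTM's additive cell update.
    h_inner, c_inner = lstm_cell(i * g, f * c_prev, c_inner_prev, *inner_params)
    c = h_inner              # c^{outer}_t = h^{inner}_t
    h = o * np.tanh(c)       # outer output; the activation here is an assumption
    return h, c, c_inner

# Tiny smoke test with hypothetical sizes (input dim 3, hidden dim 4).
rng = np.random.default_rng(0)
d_in, d_h = 3, 4
outer = (rng.normal(size=(4 * d_h, d_in)), rng.normal(size=(4 * d_h, d_h)), np.zeros(4 * d_h))
inner = (rng.normal(size=(4 * d_h, d_h)), rng.normal(size=(4 * d_h, d_h)), np.zeros(4 * d_h))
h = c = c_inner = np.zeros(d_h)
h, c, c_inner = nested_lstm_cell(rng.normal(size=d_in), h, c, c_inner, outer, inner)
```

Stacking, by contrast, would feed one LSTM's output h_t in as the next layer's input; nesting instead places the second cell inside the first cell's memory update, which is what allows the inner memory to operate on longer time scales than the outer one.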
