Analysis of memory in LSTM-RNNs for source separation (2009.00551v1)

Published 1 Sep 2020 in eess.AS and cs.SD

Abstract: Long short-term memory recurrent neural networks (LSTM-RNNs) are considered state-of-the-art in many speech processing tasks. The recurrence in the network, in principle, allows any input to be remembered for an indefinite time, a feature very useful for sequential data like speech. However, very little is known about which information is actually stored in the LSTM and for how long. We address this problem by using a memory reset approach which allows us to evaluate network performance depending on the allowed memory time span. We apply this approach to the task of multi-speaker source separation, but it can be used for any task using RNNs. We find a strong performance effect of short-term (shorter than 100 milliseconds) linguistic processes. Only speaker characteristics are kept in the memory for longer than 400 milliseconds. Furthermore, we confirm that performance-wise it is sufficient to implement longer memory in deeper layers. Finally, in a bidirectional model, the backward model contributes slightly more to the separation performance than the forward model.
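
To make the idea of a memory reset concrete, here is a minimal sketch, not the authors' implementation. It assumes the simplest possible scheme: the hidden and cell state of an LSTM are zeroed every `reset_every` frames, so no output can depend on input older than that span. The single-unit cell, the `PARAMS` values, and the names `lstm_step` and `run_with_reset` are all hypothetical and chosen only for illustration; the paper's actual reset schedule and network may differ. Converting `reset_every` from frames to milliseconds depends on the frame shift, which the abstract does not state.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h, c, p):
    """One step of a single-unit LSTM cell (toy size, for illustration only)."""
    i = sigmoid(p["wi"] * x + p["ui"] * h + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h + p["bf"])    # forget gate
    g = math.tanh(p["wg"] * x + p["ug"] * h + p["bg"])  # candidate cell value
    o = sigmoid(p["wo"] * x + p["uo"] * h + p["bo"])    # output gate
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new

def run_with_reset(xs, p, reset_every=None):
    """Run the cell over xs, zeroing (h, c) every `reset_every` frames.

    With reset_every=None the state is never reset (unrestricted memory).
    """
    h, c = 0.0, 0.0
    outputs = []
    for t, x in enumerate(xs):
        if reset_every and t > 0 and t % reset_every == 0:
            h, c = 0.0, 0.0  # forget everything seen before frame t
        h, c = lstm_step(x, h, c, p)
        outputs.append(h)
    return outputs

# Hypothetical weights; a trained model would supply its own.
PARAMS = {
    "wi": 0.5, "ui": 0.4, "bi": 0.0,
    "wf": 0.3, "uf": 0.6, "bf": 1.0,
    "wg": 0.8, "ug": 0.5, "bg": 0.0,
    "wo": 0.7, "uo": 0.2, "bo": 0.0,
}

if __name__ == "__main__":
    frames = [0.5, -1.0, 0.3, 0.8, -0.2, 0.1, 0.9, -0.7]
    print(run_with_reset(frames, PARAMS))                 # unrestricted memory
    print(run_with_reset(frames, PARAMS, reset_every=4))  # memory span of 4 frames
```

Under this assumption, evaluating a trained separation model for several values of `reset_every` and comparing the separation quality would indicate how much each memory span contributes, which is the kind of comparison the abstract describes.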