
Interpretable Filter Learning Using Soft Self-attention For Raw Waveform Speech Recognition (2001.07067v1)

Published 20 Jan 2020 in eess.AS

Abstract: Speech recognition from the raw waveform involves learning the spectral decomposition of the signal in the first layer of the neural acoustic model using a convolution layer. In this work, we propose a raw waveform convolutional filter learning approach using soft self-attention. The acoustic filter bank in the proposed model is implemented using a parametric cosine-modulated Gaussian filter bank whose parameters are learned. A network-in-network architecture provides self-attention to generate attention weights over the sub-band filters. The attention-weighted log filter bank energies are fed to the acoustic model for the task of speech recognition. Experiments are conducted on the Aurora-4 (additive noise with channel artifact) and CHiME-3 (additive noise with reverberation) databases. In these experiments, the attention-based filter learning approach provides considerable improvements in ASR performance over the baseline mel filter-bank features and other robust front-ends (average relative improvement of 7% in word error rate over baseline features on the Aurora-4 dataset, and 5% on the CHiME-3 database). Using the self-attention weights, we also present an analysis of the interpretability of the filters for the ASR task.
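The front-end described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the kernel form (a cosine at a center frequency multiplied by a Gaussian window), the two-layer scoring network standing in for the network-in-network module, the softmax normalization, and all names and sizes (`cos_gauss_filterbank`, `sigma_s`, `w1`, `w2`, 8 filters, 129 taps) are assumptions chosen for illustration. In the paper the filter parameters are learned; here they are simply fixed.

```python
import numpy as np

def cos_gauss_filterbank(center_hz, sigma_s, sample_rate=16000, kernel_len=129):
    """Cosine-modulated Gaussian kernels, one row per sub-band (assumed form)."""
    # Time axis in seconds, centered on zero so each kernel is symmetric.
    t = (np.arange(kernel_len) - kernel_len // 2) / sample_rate
    center_hz = np.asarray(center_hz, dtype=float)[:, None]
    sigma_s = np.asarray(sigma_s, dtype=float)[:, None]
    return np.cos(2 * np.pi * center_hz * t) * np.exp(-t**2 / (2 * sigma_s**2))

def log_subband_energies(frame, filters, eps=1e-8):
    """Filter one waveform frame with every kernel and return log mean energy."""
    out = np.stack([np.convolve(frame, h, mode="same") for h in filters])
    return np.log(np.mean(out**2, axis=1) + eps)

def soft_attention_weights(features, w1, w2):
    """Hypothetical two-layer scorer followed by a softmax over sub-bands."""
    hidden = np.tanh(features @ w1)
    scores = hidden @ w2
    scores = scores - scores.max()  # stabilize the exponentials
    e = np.exp(scores)
    return e / e.sum()

rng = np.random.default_rng(0)
n_filters = 8

# Fixed parameters for the sketch; a trained model would learn these.
filters = cos_gauss_filterbank(
    center_hz=np.linspace(200.0, 4000.0, n_filters),
    sigma_s=np.full(n_filters, 1e-3),
)

frame = rng.standard_normal(400)            # stand-in for 25 ms of 16 kHz audio
energies = log_subband_energies(frame, filters)

w1 = rng.standard_normal((n_filters, 16)) * 0.1   # random, untrained weights
w2 = rng.standard_normal((16, n_filters)) * 0.1
weights = soft_attention_weights(energies, w1, w2)

# Attention-weighted log filter bank energies, the input to the acoustic model.
weighted = weights * energies
```

Inspecting `weights` per sub-band is the kind of analysis the abstract refers to when it mentions interpretability: larger weights would indicate filters the model relies on more for a given input.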

