Word-level Speech Recognition with a Letter to Word Encoder (1906.04323v2)
Abstract: We propose a direct-to-word sequence model that uses a word network to learn word embeddings from letters. The word network integrates seamlessly with arbitrary sequence models, including Connectionist Temporal Classification (CTC) and encoder-decoder models with attention. We show that our direct-to-word model achieves lower word error rates than sub-word-level models for speech recognition. We also show that our direct-to-word approach retains the ability to predict words not seen at training time, without any retraining. Finally, we demonstrate that a word-level model can use a larger stride than a sub-word-level model while maintaining accuracy, making the model more efficient for both training and inference.
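The core mechanism described above, a word network that maps a word's letter sequence to a fixed-size embedding which is then scored against acoustic features to produce word-level outputs, can be sketched as follows. This is a minimal PyTorch illustration under stated assumptions: the `LetterToWordEncoder` class, the GRU encoder, the mean pooling, and the `word_logits` helper are hypothetical choices for exposition, not the authors' exact architecture.

```python
import torch
import torch.nn as nn


class LetterToWordEncoder(nn.Module):
    """Builds a word embedding from the word's letter sequence.

    A minimal sketch: the paper describes the word network only at a
    high level, so the layer choices here (GRU, mean pooling, sizes)
    are assumptions, not the authors' configuration.
    """

    def __init__(self, num_letters: int, dim: int):
        super().__init__()
        self.letter_emb = nn.Embedding(num_letters, dim, padding_idx=0)
        self.encoder = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, letters: torch.Tensor) -> torch.Tensor:
        # letters: (num_words, max_letters) integer letter ids, 0 = pad
        x = self.letter_emb(letters)
        out, _ = self.encoder(x)                    # (num_words, max_letters, 2*dim)
        # Mean-pool over letter positions, ignoring padding.
        mask = (letters != 0).unsqueeze(-1).float()
        pooled = (out * mask).sum(1) / mask.sum(1).clamp(min=1)
        return self.proj(pooled)                    # (num_words, dim)


def word_logits(acoustic: torch.Tensor, word_emb: torch.Tensor) -> torch.Tensor:
    """Score each acoustic frame against every word embedding by inner
    product; the resulting logits can feed a word-level CTC loss or an
    attention decoder, per the abstract's claim of seamless integration."""
    # acoustic: (batch, time, dim); word_emb: (vocab, dim)
    return acoustic @ word_emb.t()                  # (batch, time, vocab)


if __name__ == "__main__":
    torch.manual_seed(0)
    enc = LetterToWordEncoder(num_letters=28, dim=16)
    # Three toy words as padded letter-id sequences (0 = pad).
    lexicon = torch.tensor([[3, 1, 20, 0], [4, 15, 7, 0], [8, 9, 0, 0]])
    emb = enc(lexicon)                              # (3, 16)
    frames = torch.randn(2, 5, 16)                  # stand-in acoustic encoder output
    print(word_logits(frames, emb).shape)           # torch.Size([2, 5, 3])
```

Because each embedding is computed from spelling alone, a word unseen during training can be added to the output vocabulary at test time by encoding its letters, with no retraining, which is the open-vocabulary property the abstract describes.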