Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 43 tok/s
Gemini 2.5 Pro 49 tok/s Pro
GPT-5 Medium 17 tok/s Pro
GPT-5 High 19 tok/s Pro
GPT-4o 96 tok/s Pro
Kimi K2 197 tok/s Pro
GPT OSS 120B 455 tok/s Pro
Claude Sonnet 4 36 tok/s Pro
2000 character limit reached

Knowledge Distillation for Neural Transducers from Large Self-Supervised Pre-trained Models (2110.03334v2)

Published 7 Oct 2021 in eess.AS

Abstract: Self-supervised pre-training is an effective approach to leveraging a large amount of unlabelled data to reduce word error rates (WERs) of automatic speech recognition (ASR) systems. Since it is impractical to use large pre-trained models for many real-world ASR applications, it is desirable to have a much smaller model while retaining the performance of the pre-trained model. In this paper, we propose a simple knowledge distillation (KD) loss function for neural transducers that focuses on the one-best path in the output probability lattice under both streaming and non-streaming setups, which allows a small student model to approach the performance of the large pre-trained teacher model. Experiments on the LibriSpeech dataset show that despite being 10 times smaller than the teacher model, the proposed loss results in relative WER reductions (WERRs) of 11.5% and 6.8% on the test-other set for non-streaming and streaming student models compared to the baseline transducers trained without KD using the labelled 100-hour clean data. With an additional 860 hours of unlabelled data for KD, the WERRs increase to 48.2% and 38.5% for non-streaming and streaming students. If LLM shallow fusion is used for producing distillation targets, a further improvement in the student model is observed.

Citations (19)
List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-Up Questions

We haven't generated follow-up questions for this paper yet.