An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition (2210.05614v2)

Published 11 Oct 2022 in cs.SD, cs.LG, cs.NE, and eess.AS

Abstract: Differential privacy (DP) is one data protection avenue for safeguarding user information used to train deep models, by imposing noisy distortion on private data. Such noise perturbation often results in severe performance degradation in automatic speech recognition (ASR) in order to meet a privacy budget $\varepsilon$. Private aggregation of teacher ensembles (PATE) utilizes ensemble probabilities to improve ASR accuracy when dealing with the noise effects controlled by small values of $\varepsilon$. We extend PATE learning to work with dynamic patterns, namely speech utterances, and present a first experimental demonstration that it prevents acoustic data leakage in ASR training. We evaluate three end-to-end deep models, including LAS, hybrid CTC/attention, and RNN transducer, on the open-source LibriSpeech and TIMIT corpora. PATE learning-enhanced ASR models outperform the benchmark DP-SGD mechanisms, especially under strict DP budgets, giving relative word error rate reductions between 26.2% and 27.5% for an RNN transducer model evaluated on LibriSpeech. We also introduce a DP-preserving ASR solution for pretraining on public speech corpora.
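To make the aggregation step concrete, here is a minimal sketch of the PATE noisy-max label aggregation the abstract refers to: each teacher, trained on a disjoint data partition, votes for a class; Laplace noise scaled by $1/\varepsilon$ is added to the vote histogram before taking the argmax, which the student then uses as a training label. This is an illustrative sketch under standard PATE assumptions, not the paper's exact ASR recipe (which operates over utterance-level outputs); the function and parameter names are hypothetical.

```python
import numpy as np

def pate_noisy_aggregate(teacher_logits, epsilon, rng=None):
    """Noisy-max PATE aggregation for a single query (illustrative sketch).

    teacher_logits: array of shape (n_teachers, n_classes) holding each
        teacher's class scores for one query (e.g., one output token).
    epsilon: per-query privacy parameter; Laplace noise scale is 1/epsilon.
    Returns the noisy plurality label the student would train on.
    """
    rng = rng or np.random.default_rng()
    # Each teacher casts one vote for its top-scoring class.
    votes = np.bincount(
        teacher_logits.argmax(axis=1),
        minlength=teacher_logits.shape[1],
    ).astype(float)
    # Adding Laplace(1/epsilon) noise to the vote counts makes the
    # released argmax differentially private with respect to any one
    # teacher's training data.
    votes += rng.laplace(loc=0.0, scale=1.0 / epsilon, size=votes.shape)
    return int(votes.argmax())
```

A smaller $\varepsilon$ injects larger noise into the vote counts, which is why the paper emphasizes that ensemble probabilities help recover accuracy under strict privacy budgets.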

Authors (5)
  1. Chao-Han Huck Yang (89 papers)
  2. I-Fan Chen (8 papers)
  3. Andreas Stolcke (57 papers)
  4. Sabato Marco Siniscalchi (46 papers)
  5. Chin-Hui Lee (52 papers)
Citations (2)
