
Noise Masking Attacks and Defenses for Pretrained Speech Models (2404.02052v1)

Published 2 Apr 2024 in cs.LG

Abstract: Speech models are often trained on sensitive data in order to improve model performance, leading to potential privacy leakage. Our work considers noise masking attacks, introduced by Amid et al. 2022, which attack automatic speech recognition (ASR) models by requesting a transcript of an utterance which is partially replaced with noise. They show that when a record has been seen at training time, the model will transcribe the noisy record with its memorized sensitive transcript. In our work, we extend these attacks beyond ASR models, to attack pretrained speech encoders. Our method fine-tunes the encoder to produce an ASR model, and then performs noise masking on this model, which we find recovers private information from the pretraining data, despite the model never having seen transcripts at pretraining time! We show how to improve the precision of these attacks and investigate a number of countermeasures to our attacks.
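The core operation in the abstract, replacing part of an utterance with noise before requesting a transcript, can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the `noise_mask` helper, the Gaussian noise scaled to the utterance's RMS level, and the masked segment are all choices made for the example.

```python
import numpy as np

def noise_mask(audio, sample_rate, start_s, end_s, rng=None):
    """Return a copy of `audio` with the span [start_s, end_s) replaced
    by Gaussian noise at the utterance's RMS level.

    Hypothetical helper for illustration; the paper's exact masking
    procedure may differ (e.g. noise type and scaling).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    lo, hi = int(start_s * sample_rate), int(end_s * sample_rate)
    # Scale noise to the utterance's energy; fall back to 1.0 on silence.
    rms = float(np.sqrt(np.mean(audio ** 2))) or 1.0
    masked = audio.copy()
    masked[lo:hi] = rng.normal(0.0, rms, size=hi - lo)
    return masked

# Example: mask seconds 1.0-1.5 of a synthetic 3-second, 16 kHz utterance.
# In the attack, the masked audio is transcribed by the fine-tuned ASR
# model; recovering the original (masked-out) words signals memorization.
sr = 16_000
utterance = np.sin(2 * np.pi * 220.0 * np.arange(3 * sr) / sr).astype(np.float32)
masked = noise_mask(utterance, sr, 1.0, 1.5)
```

Only the masked span changes; audio outside it is passed through untouched, which is what lets a memorized transcript of the noisy region stand out.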

References (21)
  1. E. Amid, O. D. Thakkar, A. Narayanan, R. Mathews, and F. Beaufays, “Extracting targeted training data from ASR models, and how to mitigate it,” in Interspeech 2022, 23rd Annual Conference of the International Speech Communication Association, 2022.
  2. C. Chiu, J. Qin, Y. Zhang, J. Yu, and Y. Wu, “Self-supervised learning with random-projection quantizer for speech recognition,” in International Conference on Machine Learning, ICML 2022, 2022.
  3. Y. Zhang, W. Han, J. Qin, Y. Wang, A. Bapna, Z. Chen, N. Chen, B. Li, V. Axelrod, G. Wang et al., “Google USM: Scaling automatic speech recognition beyond 100 languages,” arXiv preprint arXiv:2303.01037, 2023.
  4. R. Shokri, M. Stronati, C. Song, and V. Shmatikov, “Membership inference attacks against machine learning models,” in 2017 IEEE Symposium on Security and Privacy (SP). IEEE, 2017, pp. 3–18.
  5. N. Carlini, F. Tramer, E. Wallace, M. Jagielski, A. Herbert-Voss, K. Lee, A. Roberts, T. Brown, D. Song, U. Erlingsson et al., “Extracting training data from large language models,” in 30th USENIX Security Symposium (USENIX Security 21), 2021.
  6. W. R. Huang, S. Chien, O. D. Thakkar, and R. Mathews, “Detecting unintended memorization in language-model-fused ASR,” in Interspeech 2022, 23rd Annual Conference of the International Speech Communication Association. ISCA, 2022, pp. 2808–2812.
  7. C. Guo, F. Bordes, P. Vincent, and K. Chaudhuri, “Do SSL models have déjà vu? A case of unintended memorization in self-supervised learning,” arXiv preprint arXiv:2304.13850, 2023.
  8. J. Abascal, S. Wu, A. Oprea, and J. Ullman, “TMI! Finetuned models leak private information from their pretraining data,” arXiv preprint arXiv:2306.01181, 2023.
  9. M. Jagielski, O. Thakkar, F. Tramer, D. Ippolito, K. Lee, N. Carlini, E. Wallace, S. Song, A. G. Thakurta, N. Papernot et al., “Measuring forgetting of memorized training examples,” in The Eleventh International Conference on Learning Representations, 2022.
  10. K. Leino and M. Fredrikson, “Stolen memories: Leveraging model memorization for calibrated white-box membership inference,” in 29th USENIX Security Symposium (USENIX Security 20), 2020, pp. 1605–1622.
  11. N. Carlini, S. Chien, M. Nasr, S. Song, A. Terzis, and F. Tramer, “Membership inference attacks from first principles,” in 2022 IEEE Symposium on Security and Privacy (SP). IEEE, 2022, pp. 1897–1914.
  12. J. Kahn, M. Rivière, W. Zheng, E. Kharitonov, Q. Xu, P. Mazaré, J. Karadayi, V. Liptchinsky, R. Collobert, C. Fuegen, T. Likhomanenko, G. Synnaeve, A. Joulin, A. Mohamed, and E. Dupoux, “Libri-light: A benchmark for ASR with limited or no supervision,” in 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2020.
  13. V. Panayotov, G. Chen, D. Povey, and S. Khudanpur, “Librispeech: An ASR corpus based on public domain audio books,” in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015. IEEE, 2015, pp. 5206–5210.
  14. A. Gulati, J. Qin, C. Chiu, N. Parmar, Y. Zhang, J. Yu, W. Han, S. Wang, Z. Zhang, Y. Wu, and R. Pang, “Conformer: Convolution-augmented transformer for speech recognition,” in Interspeech 2020, 21st Annual Conference of the International Speech Communication Association. ISCA, 2020, pp. 5036–5040.
  15. A. Graves, “Connectionist temporal classification,” Supervised Sequence Labelling with Recurrent Neural Networks, pp. 61–93, 2012.
  16. N. Lukas, A. Salem, R. Sim, S. Tople, L. Wutschitz, and S. Zanella-Béguelin, “Analyzing leakage of personally identifiable information in language models,” arXiv preprint arXiv:2302.00539, 2023.
  17. C. Kim, A. Misra, K. K. Chin, T. Hughes, A. Narayanan, T. N. Sainath, and M. Bacchiani, “Generation of large-scale simulated utterances in virtual rooms to train deep-neural networks for far-field speech recognition in Google Home,” in Interspeech 2017, 18th Annual Conference of the International Speech Communication Association. ISCA, 2017, pp. 379–383.
  18. K. Lee, D. Ippolito, A. Nystrom, C. Zhang, D. Eck, C. Callison-Burch, and N. Carlini, “Deduplicating training data makes language models better,” arXiv preprint arXiv:2107.06499, 2021.
  19. N. Kandpal, E. Wallace, and C. Raffel, “Deduplicating training data mitigates privacy risks in language models,” in International Conference on Machine Learning. PMLR, 2022, pp. 10697–10707.
  20. N. Carlini, J. Hayes, M. Nasr, M. Jagielski, V. Sehwag, F. Tramer, B. Balle, D. Ippolito, and E. Wallace, “Extracting training data from diffusion models,” in 32nd USENIX Security Symposium (USENIX Security 23), 2023.
  21. M. McCloskey and N. J. Cohen, “Catastrophic interference in connectionist networks: The sequential learning problem,” in Psychology of Learning and Motivation. Elsevier, 1989, vol. 24, pp. 109–165.
Authors (3)
  1. Matthew Jagielski (51 papers)
  2. Om Thakkar (25 papers)
  3. Lun Wang (33 papers)
Citations (4)
