
Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users using Intermediate ASR Features and Human Memory Models (2401.13611v1)

Published 24 Jan 2024 in cs.SD, cs.AI, and eess.AS

Abstract: Neural networks have been successfully used for non-intrusive speech intelligibility prediction. Recently, the use of feature representations sourced from intermediate layers of pre-trained self-supervised and weakly-supervised models has been found to be particularly useful for this task. This work combines the use of Whisper ASR decoder layer representations as neural network input features with an exemplar-based, psychologically motivated model of human memory to predict human intelligibility ratings for hearing-aid users. Substantial performance improvement over an established intrusive HASPI baseline system is found, including on enhancement systems and listeners unseen in the training data, with a root mean squared error of 25.3 compared with the baseline of 28.7.
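The exemplar-based idea in the abstract can be illustrated with a minimal sketch in the spirit of Hintzman's MINERVA 2, in which a probe activates each stored exemplar by the cube of its similarity and the prediction is an activation-weighted blend of the exemplars' ratings. This is not the authors' implementation: the function names (`cosine`, `minerva_predict`, `rmse`), the normalisation by total absolute activation, and the toy two-dimensional vectors standing in for Whisper decoder features are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def minerva_predict(probe, exemplars, ratings, power=3):
    """Predict a rating for `probe` from stored (exemplar, rating) pairs.

    Each exemplar's activation is its similarity to the probe raised to an
    odd power (3 in MINERVA 2), which keeps the sign and sharpens the
    contrast between near and far exemplars. Dividing by the total absolute
    activation is an assumption of this sketch, used to keep the output on
    the same scale as the ratings.
    """
    activations = [cosine(probe, e) ** power for e in exemplars]
    total = sum(abs(a) for a in activations)
    if total == 0.0:
        # No exemplar is activated: fall back to the mean stored rating.
        return sum(ratings) / len(ratings)
    return sum(a * r for a, r in zip(activations, ratings)) / total


def rmse(predicted, target):
    """Root mean squared error, the metric quoted in the abstract."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target))


# Toy data: two stored exemplars with intelligibility ratings in percent.
exemplars = [[1.0, 0.0], [0.0, 1.0]]
ratings = [80.0, 20.0]

same_as_first = minerva_predict([1.0, 0.0], exemplars, ratings)
halfway = minerva_predict([1.0, 1.0], exemplars, ratings)
```

In this toy setup a probe identical to the first exemplar recovers that exemplar's rating, while a probe equally similar to both exemplars lands midway between the two ratings. In a real system the exemplars would be learned or stored feature vectors rather than hand-written ones.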

