
Analysis of Deep Feature Loss based Enhancement for Speaker Verification (2002.00139v2)

Published 1 Feb 2020 in eess.AS and cs.SD

Abstract: Data augmentation is conventionally used to inject robustness into Speaker Verification systems. Several recently organized challenges focus on handling novel acoustic environments. Deep learning based speech enhancement is a modern solution for this. Recently, a study proposed to optimize the enhancement network in the activation space of a pre-trained auxiliary network. This methodology, called deep feature loss, greatly improved over the state-of-the-art conventional x-vector based system on a children's speech dataset called BabyTrain. This work analyzes various facets of that approach and asks a few novel questions in that context. We first search for the optimal number of auxiliary network activations, amount of training data, and enhancement feature dimension. Experiments reveal the importance of Signal-to-Noise Ratio filtering, which we employ to create a large, clean, and naturalistic corpus for enhancement network training. To counter the "mismatch" problem in enhancement, we find that enhancing the front-end (x-vector network) data is helpful, while enhancing the back-end (Probabilistic Linear Discriminant Analysis (PLDA)) data is harmful. Importantly, we find that enhanced signals contain information complementary to the original signals: combining them in the front-end gives ~40% relative improvement over the baseline. We also do an ablation study that removes a noise class from x-vector data augmentation and, for such systems, we establish the utility of enhancement regardless of whether the enhancement network itself has seen that noise class during training. Finally, we design several dereverberation schemes and conclude that the deep feature loss enhancement scheme is ineffective for this task.
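
The idea of a deep feature loss, as described in the abstract, can be illustrated with a toy sketch. Instead of comparing enhanced and clean features directly, both are passed through a frozen auxiliary network and the distance is measured between the resulting layer activations. Everything concrete below is an assumption made for illustration and is not taken from the paper: the auxiliary network is a tiny random ReLU stack rather than a pre-trained speaker network, the distance is a mean absolute difference per layer, and the names `aux_activations`, `deep_feature_loss`, and `num_layers` are hypothetical. The `num_layers` argument only mirrors the abstract's question of how many auxiliary activations to use.

```python
import random


def relu(vec):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in vec]


def matvec(weight, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def aux_activations(weights, features):
    """Return the activations of every layer of a frozen auxiliary network.

    Assumption: the network is a plain stack of linear layers with ReLU.
    """
    activations = []
    hidden = features
    for weight in weights:
        hidden = relu(matvec(weight, hidden))
        activations.append(hidden)
    return activations


def deep_feature_loss(weights, enhanced, clean, num_layers=None):
    """Sum of per-layer mean absolute differences between activations.

    Only the first `num_layers` layers are compared (all layers if None).
    The L1-style distance is an assumption for this sketch.
    """
    acts_enhanced = aux_activations(weights, enhanced)[:num_layers]
    acts_clean = aux_activations(weights, clean)[:num_layers]
    total = 0.0
    for act_e, act_c in zip(acts_enhanced, acts_clean):
        total += sum(abs(e - c) for e, c in zip(act_e, act_c)) / len(act_e)
    return total


# Toy data: a random 3-layer "auxiliary network" and one feature frame.
rng = random.Random(0)
dim = 8
aux_weights = [
    [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
    for _ in range(3)
]
clean = [rng.uniform(0, 1) for _ in range(dim)]
noisy = [c + rng.gauss(0, 0.3) for c in clean]

loss_all_layers = deep_feature_loss(aux_weights, noisy, clean)
loss_first_layer = deep_feature_loss(aux_weights, noisy, clean, num_layers=1)
```

In an actual training setup the enhancement network's output would take the place of `noisy`, and this loss would be minimized with respect to the enhancement network's parameters while the auxiliary weights stay fixed.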

Citations (13)
