Papers
Topics
Authors
Recent
2000 character limit reached

UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-noise Ratio Condition (2010.15521v1)

Published 29 Oct 2020 in eess.AS, cs.SD, and eess.SP

Abstract: Speech enhancement at extremely low signal-to-noise ratio (SNR) condition is a very challenging problem and rarely investigated in previous works. This paper proposes a robust speech enhancement approach (UNetGAN) based on U-Net and generative adversarial learning to deal with this problem. This approach consists of a generator network and a discriminator network, which operate directly in the time domain. The generator network adopts a U-Net like structure and employs dilated convolution in the bottleneck of it. We evaluate the performance of the UNetGAN at low SNR conditions (up to -20dB) on the public benchmark. The result demonstrates that it significantly improves the speech quality and substantially outperforms the representative deep learning models, including SEGAN, cGAN fo SE, Bidirectional LSTM using phase-sensitive spectrum approximation cost function (PSA-BLSTM) and Wave-U-Net regarding Short-Time Objective Intelligibility (STOI) and Perceptual evaluation of speech quality (PESQ).

Citations (32)

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.