Dynamic Attention Based Generative Adversarial Network with Phase Post-Processing for Speech Enhancement (2006.07530v1)

Published 13 Jun 2020 in cs.SD and eess.AS

Abstract: Generative adversarial networks (GANs) have recently facilitated the development of speech enhancement. Nevertheless, their performance advantage remains limited compared with state-of-the-art models. In this paper, we propose a powerful Dynamic Attention Recursive GAN, called DARGAN, for noise reduction in the time-frequency domain. Different from previous works, we make several innovations. First, recursive learning, an iterative training protocol consisting of multiple steps, is used in the generator. By reusing the network in each step, the noise components are progressively reduced in a step-wise manner. Second, a dynamic attention mechanism is deployed, which helps to re-adjust the feature distribution in the noise reduction module. Third, we exploit the deep Griffin-Lim algorithm as the module for phase post-processing, which facilitates further improvement in speech quality. Experimental results on the Voice Bank corpus show that the proposed GAN achieves state-of-the-art performance, outperforming previous GAN- and non-GAN-based models.
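The recursive learning protocol described in the abstract can be sketched as a loop that reuses the same generator at every step, feeding each step's output back in. The sketch below uses a toy stand-in generator (a simple scaling toward a noise floor) rather than the paper's network; the function names and the number of steps are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of step-wise recursive enhancement: the SAME generator
# is applied repeatedly, so noise is reduced progressively across steps.
# `toy_generator` is a stand-in, NOT the paper's attention-based network.

def recursive_enhance(noisy, generator, num_steps=3):
    """Reuse one generator for `num_steps` steps, refining the estimate
    produced by the previous step each time."""
    estimate = noisy
    for _ in range(num_steps):
        estimate = generator(estimate)
    return estimate

def toy_generator(spectrum):
    # Crude stand-in: attenuate every time-frequency bin by half.
    return [0.5 * x for x in spectrum]

noisy_magnitudes = [4.0, 2.0, 8.0]
enhanced = recursive_enhance(noisy_magnitudes, toy_generator, num_steps=3)
# After 3 steps each bin is scaled by 0.5 ** 3 = 0.125.
```

In the paper's setting, each step would instead run the full attention-based noise reduction module on the previous step's spectrogram estimate.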

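The phase post-processing step builds on the Griffin-Lim algorithm. The paper uses a *deep* Griffin-Lim module; the sketch below shows only the underlying classic alternating-projection idea, implemented with SciPy's STFT/ISTFT. All parameter choices (window length, iteration count, random seed) are illustrative assumptions.

```python
# Minimal sketch of classic iterative Griffin-Lim phase reconstruction,
# assuming SciPy. This is NOT the paper's deep Griffin-Lim module, only the
# alternating-projection scheme it unrolls.
import numpy as np
from scipy.signal import stft, istft

def griffin_lim(magnitude, n_iter=32, nperseg=256):
    """Recover a time-domain signal whose STFT magnitude matches `magnitude`:
    start from random phase, then alternate ISTFT/STFT projections while
    keeping the target magnitude fixed."""
    rng = np.random.default_rng(0)
    phase = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, magnitude.shape))
    spec = magnitude * phase
    for _ in range(n_iter):
        _, signal = istft(spec, nperseg=nperseg)          # back to time domain
        _, _, reproj = stft(signal, nperseg=nperseg)      # re-project to T-F
        spec = magnitude * np.exp(1j * np.angle(reproj))  # keep target magnitude
    _, signal = istft(spec, nperseg=nperseg)
    return signal

# Example: rebuild a plausible phase for a pure tone's magnitude spectrogram.
t = np.arange(2048) / 16000.0
x = np.sin(2 * np.pi * 440.0 * t)
_, _, X = stft(x, nperseg=256)
recon = griffin_lim(np.abs(X), n_iter=8)
```

A deep Griffin-Lim module replaces the fixed projection updates with learned ones, which is what the paper exploits for the final quality gain.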
Authors (5)
  1. Andong Li (34 papers)
  2. Chengshi Zheng (40 papers)
  3. Renhua Peng (7 papers)
  4. Cunhang Fan (35 papers)
  5. Xiaodong Li (146 papers)
Citations (4)
