
Transformation of low-quality device-recorded speech to high-quality speech using improved SEGAN model (1911.03952v2)

Published 10 Nov 2019 in cs.SD and eess.AS

Abstract: Nowadays, vast amounts of speech data are recorded on low-quality recording devices such as smartphones, tablets, laptops, and medium-quality microphones. The objective of this research was to study the automatic generation of high-quality speech from such low-quality device-recorded speech, which could then be applied to many speech-generation tasks. In this paper, we first introduce our new device-recorded speech dataset and then propose an improved end-to-end method for automatically transforming low-quality device-recorded speech into professional high-quality speech. Our method is an extension of a generative adversarial network (GAN)-based speech enhancement model called speech enhancement GAN (SEGAN), and we present two modifications that make model training more robust and stable. Finally, using a large-scale listening test, we show that our method can significantly enhance the quality of device-recorded speech signals.

Authors (6)
  1. Seyyed Saeed Sarfjoo (8 papers)
  2. Xin Wang (1307 papers)
  3. Gustav Eje Henter (51 papers)
  4. Jaime Lorenzo-Trueba (33 papers)
  5. Shinji Takaki (16 papers)
  6. Junichi Yamagishi (178 papers)
Citations (8)
