PoLyScriber: Integrated Fine-tuning of Extractor and Lyrics Transcriber for Polyphonic Music (2207.07336v4)

Published 15 Jul 2022 in eess.AS, cs.SD, and eess.SP

Abstract: Lyrics transcription of polyphonic music is challenging because the background music degrades lyrics intelligibility. Lyrics transcription is typically performed by a two-step pipeline, i.e., a singing vocal extraction front end followed by a lyrics transcriber back end, with the two trained separately. Such a pipeline suffers from both imperfect vocal extraction and a mismatch between the front end and the back end. In this work, we propose a novel end-to-end integrated fine-tuning framework, which we call PoLyScriber, that globally optimizes the vocal extractor front end and the lyrics transcriber back end for lyrics transcription in polyphonic music. Experimental results show that PoLyScriber achieves substantial improvements over existing approaches on publicly available test datasets.

Authors: Xiaoxue Gao, Chitralekha Gupta, Haizhou Li
