Hybrid Y-Net Architecture for Singing Voice Separation (2303.02599v1)
Abstract: This research paper presents a novel deep learning-based neural network architecture, named Y-Net, for achieving music source separation. The proposed architecture performs end-to-end hybrid source separation by extracting features from both spectrogram and waveform domains. Inspired by the U-Net architecture, Y-Net predicts a spectrogram mask to separate vocal sources from a mixture signal. Our results demonstrate the effectiveness of the proposed architecture for music source separation with fewer parameters. Overall, our work presents a promising approach for improving the accuracy and efficiency of music source separation.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.