Emergent Mind

Beamformer-Guided Target Speaker Extraction

(2303.08702)
Published Mar 15, 2023 in eess.AS and cs.SD

Abstract

We propose a Beamformer-guided Target Speaker Extraction (BG-TSE) method to extract a target speaker's voice from a multi-channel recording informed by the direction of arrival of the target. The proposed method employs a front-end beamformer steered towards the target speaker to provide an auxiliary signal to a single-channel TSE system. By allowing for time-varying embeddings in the single-channel TSE block, the proposed method fully exploits the correspondence between the front-end beamformer output and the target speech in the microphone signal. Experimental evaluation on simulated multi-channel 2-speaker mixtures, in both anechoic and reverberant conditions, demonstrates the advantage of the proposed method compared to recent single-channel and multi-channel baselines.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.