Emergent Mind

Abstract

Automatic speaker verification (ASV) is the process of recognizing persons using their voice as a biometric. ASV systems achieve considerable recognition performance given a sufficient amount of speech from matched conditions. One of the crucial challenges of ASV technology is to improve recognition performance with speech segments of short duration. In the short-duration condition, the model parameters are not properly estimated due to inadequate speech information, and this results in poor recognition accuracy even with the state-of-the-art i-vector based ASV system. We hypothesize that considering the estimation quality during the recognition process would help to improve ASV performance. This can be incorporated as a quality measure during fusion of ASV systems. This paper investigates a new quality measure for the i-vector representation of speech utterances, computed directly from Baum-Welch statistics. The proposed metric is subsequently used as a quality measure during fusion of ASV systems. In experiments with the NIST SRE 2008 corpus, we show that including the proposed quality metric yields considerable improvement in speaker verification performance. The results also indicate the potential of the proposed method in real-world scenarios with short test utterances.
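
To make the idea of using a quality measure during fusion concrete, the following is a minimal sketch of quality-dependent linear score fusion. It is not the paper's method: the abstract does not give the fusion rule or the formula for the quality metric, so the class name `QualityFusion`, the linear weight form, the assumption that quality lies in [0, 1], and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class QualityFusion:
    """Linear score fusion whose weights depend on a quality value.

    Hypothetical illustration: the weight of subsystem i is
    base_weights[i] + quality_weights[i] * quality.
    """
    bias: float
    base_weights: Sequence[float]     # one weight per ASV subsystem
    quality_weights: Sequence[float]  # how each weight shifts with quality

    def fuse(self, scores: Sequence[float], quality: float) -> float:
        if not (len(scores) == len(self.base_weights) == len(self.quality_weights)):
            raise ValueError("scores and weight lists must have equal length")
        fused = self.bias
        for s, a, b in zip(scores, self.base_weights, self.quality_weights):
            fused += (a + b * quality) * s
        return fused


# Two subsystems; quality assumed to lie in [0, 1] (an assumption).
fusion = QualityFusion(bias=0.0,
                       base_weights=[0.5, 0.5],
                       quality_weights=[0.5, -0.5])

scores = [2.0, 1.0]
high_q = fusion.fuse(scores, quality=1.0)  # effective weights (1.0, 0.0)
low_q = fusion.fuse(scores, quality=0.0)   # effective weights (0.5, 0.5)
print(high_q, low_q)
```

In this sketch, a high quality value shifts trust toward the first subsystem, while a low value falls back to an equal-weight average. In practice the weights would be trained on development data, for example with logistic regression.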
