Emergent Mind

Abstract

Deep speaker embeddings have been shown to be effective for assessing cognitive impairments, beyond their original purpose of speaker verification. However, prior research has found that speaker embeddings encode not only speaker identity but also an array of other information, including speaker demographics such as sex and age and, to an extent, speech content, which are known confounders in the assessment of cognitive impairments. In this paper, we hypothesize that content information separated from speaker identity using a voice conversion framework is more effective for assessing cognitive impairments, and we train simple classifiers for a comparative analysis on the DementiaBank Pitt Corpus. Our results show that content embeddings have an advantage over speaker embeddings for the defined problem. Further experiments, however, show that their effectiveness depends on the information encoded in speaker embeddings, owing to the inherent design of the architecture used for extracting content.
