Understanding the Tradeoffs in Client-side Privacy for Downstream Speech Tasks

(2101.08919)
Published Jan 22, 2021 in eess.AS, cs.CR, and cs.SD

Abstract

As users increasingly rely on cloud-based computing services, it is important to ensure that uploaded speech data remains private. Existing solutions rely either on server-side methods or focus on hiding speaker identity. While these approaches reduce certain security concerns, they do not give users client-side control over whether their biometric information is sent to the server. In this paper, we formally define client-side privacy and discuss its three unique technical challenges: (1) direct manipulation of raw data on client devices, (2) adaptability to a broad range of server-side processing models, and (3) low time and space complexity for compatibility with limited-bandwidth devices. Solving these challenges requires new models that achieve high-fidelity reconstruction, privacy preservation of sensitive personal attributes, and efficiency during training and inference. As a step towards client-side privacy for speech recognition, we investigate three techniques spanning signal processing, disentangled representation learning, and adversarial training. Through a series of gender and accent masking tasks, we observe that each method has its own strengths, but none effectively balances the trade-offs between performance, privacy, and complexity. These insights call for more research in client-side privacy to ensure a safer deployment of cloud-based speech processing services.
