Decepticons: Corrupted Transformers Breach Privacy in Federated Learning for Language Models (2201.12675v2)

Published 29 Jan 2022 in cs.LG, cs.CL, and cs.CR

Abstract: A central tenet of federated learning (FL), which trains models without centralizing user data, is privacy. However, previous work has shown that the gradient updates used in FL can leak user information. While the most prominent industrial uses of FL are text applications (e.g., keystroke prediction), nearly all attacks on FL privacy have focused on simple image classifiers. We propose a novel attack that reveals private user text by deploying malicious parameter vectors, and which succeeds even with mini-batches, multiple users, and long sequences. Unlike previous attacks on FL, the attack exploits characteristics of both the Transformer architecture and the token embedding, separately extracting tokens and positional embeddings to retrieve high-fidelity text. This work suggests that FL on text, which has historically been resistant to privacy attacks, is far more vulnerable than previously thought.

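The abstract's starting point, that gradient updates can leak the text in a user's batch, is easy to illustrate for the token-embedding layer: its weight gradient is nonzero only in the rows of tokens that actually appeared in the input. The sketch below shows this with a toy embedding-plus-linear model in PyTorch; the model, vocabulary size, and "secret" token ids are illustrative assumptions and this is not the paper's actual attack, which additionally deploys malicious parameter vectors to recover token order via positional embeddings.

```python
import torch
import torch.nn as nn

# Toy stand-in for a language model: an embedding layer plus a linear head.
# Vocabulary size, dimensions, and the secret token ids are illustrative
# assumptions, not values or code from the paper.
vocab_size, d_model = 100, 16
embedding = nn.Embedding(vocab_size, d_model)
head = nn.Linear(d_model, vocab_size)

# A user's private mini-batch of token ids.
secret_tokens = torch.tensor([[5, 17, 42, 8]])

# One local training step, just to produce the gradient a client would send.
logits = head(embedding(secret_tokens))
loss = nn.functional.cross_entropy(logits.view(-1, vocab_size),
                                   secret_tokens.view(-1))
loss.backward()

# Server-side view: rows of the embedding-weight gradient are nonzero only
# for tokens that occurred in the batch, so the token set leaks immediately.
grad = embedding.weight.grad
leaked = torch.nonzero(grad.abs().sum(dim=1) > 0).flatten()
print("recovered token ids:", leaked.tolist())  # [5, 8, 17, 42]
```

Recovering which tokens appeared is only the first step; the paper's contribution is reconstructing full, ordered sequences by also exploiting the positional embeddings and the structure of the Transformer, even across mini-batches and multiple users.
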
Authors (7)
  1. Liam Fowl (25 papers)
  2. Jonas Geiping (73 papers)
  3. Steven Reich (9 papers)
  4. Yuxin Wen (33 papers)
  5. Wojtek Czaja (4 papers)
  6. Micah Goldblum (96 papers)
  7. Tom Goldstein (226 papers)
Citations (44)
