Disentangling the Linguistic Competence of Privacy-Preserving BERT (2310.11363v1)
Abstract: Differential Privacy (DP) has been tailored to address the unique challenges of text-to-text privatization. However, text-to-text privatization is known to degrade the performance of language models trained on perturbed text. Employing a series of interpretation techniques on the internal representations extracted from BERT trained on perturbed pre-training text, we aim to disentangle, at the linguistic level, the distortion induced by differential privacy. Experimental results from a representational similarity analysis indicate that the overall similarity of internal representations is substantially reduced. Using probing tasks to unpack this dissimilarity, we find evidence that text-to-text privatization affects linguistic competence across several formalisms: the privatized model encodes localized properties of words while falling short at encoding the contextual relationships between spans of words.
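To make the two techniques in the abstract concrete, the sketch below illustrates (a) word-level text-to-text privatization in the metric-DP style, where each word embedding is perturbed with noise calibrated to a privacy parameter epsilon and then decoded to its nearest vocabulary neighbour, and (b) a representational similarity analysis (RSA) score that correlates the pairwise similarity structure of the same stimuli under two models. This is a minimal illustration under assumed details: the function names, the toy vocabulary, the random embeddings, and the exact noise calibration are placeholders, not the paper's implementation.

```python
import numpy as np

def metric_dp_noise(dim: int, epsilon: float, rng: np.random.Generator) -> np.ndarray:
    # Multivariate Laplace-style noise: a uniformly random direction scaled by a
    # Gamma-distributed magnitude, calibrated by the privacy parameter epsilon.
    # (Illustrative calibration; mechanisms in the literature differ in detail.)
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    radius = rng.gamma(shape=dim, scale=1.0 / epsilon)
    return radius * direction

def privatize_tokens(tokens, words, vectors, epsilon, seed=0):
    # Text-to-text privatization: noise each token's embedding, then decode to
    # the nearest vocabulary word, yielding a perturbed surface text.
    rng = np.random.default_rng(seed)
    word_index = {w: i for i, w in enumerate(words)}
    out = []
    for tok in tokens:
        noisy = vectors[word_index[tok]] + metric_dp_noise(vectors.shape[1], epsilon, rng)
        out.append(words[int(np.argmin(np.linalg.norm(vectors - noisy, axis=1)))])
    return out

def rsa_score(reps_a, reps_b):
    # Representational similarity analysis: correlate the pairwise cosine
    # similarity structures induced by two sets of representations of the
    # same stimuli (e.g., layer activations of a clean vs. a private model).
    def sim(reps):
        normed = reps / np.linalg.norm(reps, axis=1, keepdims=True)
        return normed @ normed.T
    iu = np.triu_indices(reps_a.shape[0], k=1)
    return float(np.corrcoef(sim(reps_a)[iu], sim(reps_b)[iu])[0, 1])

# Toy usage with random embeddings standing in for a real vocabulary.
rng = np.random.default_rng(1)
words = ["queen", "king", "woman", "man", "crown", "throne"]
vectors = rng.normal(size=(len(words), 50))
print(privatize_tokens(["queen", "man"], words, vectors, epsilon=5.0))
print(rsa_score(rng.normal(size=(10, 32)), rng.normal(size=(10, 32))))
```

In practice, the embeddings would come from a pre-trained matrix such as GloVe or BERT's input embeddings, and the representations compared by RSA would be layer-wise activations extracted from a model trained on clean text versus one trained on privatized text.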