Disentangling the Linguistic Competence of Privacy-Preserving BERT (2310.11363v1)
Abstract: Differential Privacy (DP) has been tailored to address the unique challenges of text-to-text privatization. However, text-to-text privatization is known to degrade the performance of language models trained on perturbed text. Employing a series of interpretation techniques on the internal representations extracted from BERT trained on perturbed pre-training text, we aim to disentangle, at the linguistic level, the distortion induced by differential privacy. Experimental results from a representational similarity analysis indicate that the overall similarity of internal representations is substantially reduced. Using probing tasks to unpack this dissimilarity, we find evidence that text-to-text privatization affects linguistic competence across several formalisms: the privatized model encodes localized properties of words while falling short at encoding the contextual relationships between spans of words.
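To make the two techniques in the abstract concrete, the sketch below illustrates (a) word-level text-to-text privatization in the metric-DP style, where each word embedding is perturbed with noise calibrated to a privacy parameter epsilon and then decoded to its nearest vocabulary neighbour, and (b) a representational similarity analysis (RSA) score that correlates the pairwise similarity structure of the same stimuli under two models. This is a minimal illustration under assumed details: the function names, the toy vocabulary, the random embeddings, and the exact noise calibration are placeholders, not the paper's implementation.

```python
import numpy as np

def metric_dp_noise(dim: int, epsilon: float, rng: np.random.Generator) -> np.ndarray:
    # Multivariate Laplace-style noise: a uniformly random direction scaled by a
    # Gamma-distributed magnitude, calibrated by the privacy parameter epsilon.
    # (Illustrative calibration; mechanisms in the literature differ in detail.)
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    radius = rng.gamma(shape=dim, scale=1.0 / epsilon)
    return radius * direction

def privatize_tokens(tokens, words, vectors, epsilon, seed=0):
    # Text-to-text privatization: noise each token's embedding, then decode to
    # the nearest vocabulary word, yielding a perturbed surface text.
    rng = np.random.default_rng(seed)
    word_index = {w: i for i, w in enumerate(words)}
    out = []
    for tok in tokens:
        noisy = vectors[word_index[tok]] + metric_dp_noise(vectors.shape[1], epsilon, rng)
        out.append(words[int(np.argmin(np.linalg.norm(vectors - noisy, axis=1)))])
    return out

def rsa_score(reps_a, reps_b):
    # Representational similarity analysis: correlate the pairwise cosine
    # similarity structures induced by two sets of representations of the
    # same stimuli (e.g., layer activations of a clean vs. a private model).
    def sim(reps):
        normed = reps / np.linalg.norm(reps, axis=1, keepdims=True)
        return normed @ normed.T
    iu = np.triu_indices(reps_a.shape[0], k=1)
    return float(np.corrcoef(sim(reps_a)[iu], sim(reps_b)[iu])[0, 1])

# Toy usage with random embeddings standing in for a real vocabulary.
rng = np.random.default_rng(1)
words = ["queen", "king", "woman", "man", "crown", "throne"]
vectors = rng.normal(size=(len(words), 50))
print(privatize_tokens(["queen", "man"], words, vectors, epsilon=5.0))
print(rsa_score(rng.normal(size=(10, 32)), rng.normal(size=(10, 32))))
```

In practice, the embeddings would come from a pre-trained matrix such as GloVe or BERT's input embeddings, and the representations compared by RSA would be layer-wise activations extracted from a model trained on clean text versus one trained on privatized text.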