
On Adversarial Examples for Text Classification by Perturbing Latent Representations (2405.03789v1)

Published 6 May 2024 in cs.LG, cs.AI, cs.CL, and cs.CR

Abstract: Recent advances in deep learning have substantially improved several applications in text classification. However, this improvement comes at a cost, because deep learning models are vulnerable to adversarial examples, which indicates that they are not very robust. Fortunately, the input of a text classifier is discrete, which can shield the classifier from state-of-the-art gradient-based attacks. Nonetheless, previous works have devised black-box attacks that successfully manipulate the discrete values of the input to find adversarial examples. Therefore, instead of changing the discrete values, we transform the input into its embedding vector of real values and apply state-of-the-art white-box attacks to it. We then convert the perturbed embedding vector back into text and call the result an adversarial example. In summary, we create a framework that measures the robustness of a text classifier by using the gradients of the classifier.
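The abstract outlines a three-step pipeline: embed the discrete input, perturb the continuous embedding with a white-box, gradient-based attack, and map the perturbed embedding back to text. The sketch below illustrates that idea under explicit assumptions: a toy PyTorch classifier, an FGSM-style gradient step, and nearest-neighbour decoding back to vocabulary tokens. None of these specifics come from the paper itself; they are one plausible instantiation of the pipeline.

```python
# Minimal sketch of the pipeline described in the abstract: embed the discrete
# input, apply a white-box gradient step to the continuous embeddings, then map
# the perturbed embeddings back to discrete tokens. The toy model, the
# FGSM-style step, and the nearest-neighbour decoding are illustrative
# assumptions, not the authors' exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB_SIZE, EMB_DIM, NUM_CLASSES = 1000, 64, 2

class ToyTextClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.classifier = nn.Linear(EMB_DIM, NUM_CLASSES)

    def forward_from_embeddings(self, emb):
        # Mean-pool the token embeddings, then classify.
        return self.classifier(emb.mean(dim=1))

def embedding_space_attack(model, token_ids, label, epsilon=0.5):
    """Perturb embeddings with the loss gradient, then decode to nearest tokens."""
    emb = model.embedding(token_ids).detach().requires_grad_(True)
    loss = F.cross_entropy(model.forward_from_embeddings(emb), label)
    loss.backward()
    # White-box step: move each embedding in the direction that increases the loss.
    adv_emb = (emb + epsilon * emb.grad.sign()).detach()
    # Recover a discrete text by snapping each perturbed vector to its nearest
    # vocabulary embedding (one plausible way to convert back to tokens).
    dists = torch.cdist(adv_emb.view(-1, EMB_DIM), model.embedding.weight)
    return dists.argmin(dim=-1).view(token_ids.shape)

model = ToyTextClassifier()
token_ids = torch.randint(0, VOCAB_SIZE, (1, 12))  # one 12-token "sentence"
label = torch.tensor([1])
adv_ids = embedding_space_attack(model, token_ids, label)
print("tokens changed:", (adv_ids != token_ids).sum().item())
```

In this toy setup, the epsilon parameter trades off attack strength against how far the perturbed embeddings drift from real token embeddings; the paper's framework uses the classifier's gradients in the same spirit, but its exact perturbation and decoding procedure are not specified in the abstract.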

