Detecting Propaganda Techniques in Code-Switched Social Media Text (2305.14534v2)

Published 23 May 2023 in cs.CL and cs.AI

Abstract: Propaganda is a form of communication intended to influence the opinions and the mindset of the public to promote a particular agenda. With the rise of social media, propaganda has spread rapidly, leading to the need for automatic propaganda detection systems. Most work on propaganda detection has focused on high-resource languages, such as English, and little effort has been made to detect propaganda for low-resource languages. Yet, it is common to find a mix of multiple languages in social media communication, a phenomenon known as code-switching. Code-switching combines different languages within the same text, which poses a challenge for automatic systems. With this in mind, here we propose the novel task of detecting propaganda techniques in code-switched text. To support this task, we create a corpus of 1,030 texts code-switching between English and Roman Urdu, annotated with 20 propaganda techniques, which we make publicly available. We perform a number of experiments contrasting different experimental setups, and we find that it is important to model the multilinguality directly (rather than using translation) as well as to use the right fine-tuning strategy. The code and the dataset are publicly available at https://github.com/mbzuai-nlp/propaganda-codeswitched-text
