Investigating the Robustness of Modelling Decisions for Few-Shot Cross-Topic Stance Detection: A Preregistered Study (2404.03987v1)

Published 5 Apr 2024 in cs.CL

Abstract: For a viewpoint-diverse news recommender, identifying whether two news articles express the same viewpoint is essential. One way to determine "same or different" viewpoint is stance detection. In this paper, we investigate the robustness of operationalization choices for few-shot stance detection, with special attention to modelling stance across different topics. Our experiments test pre-registered hypotheses on stance detection. Specifically, we compare two stance task definitions (Pro/Con versus Same Side Stance), two LLM architectures (bi-encoding versus cross-encoding), and adding Natural Language Inference knowledge, with pre-trained RoBERTa models trained with shots of 100 examples from 7 different stance detection datasets. Some of our hypotheses and claims from earlier work can be confirmed, while others give less consistent results. The effect of the Same Side Stance definition on performance differs per dataset and is influenced by other modelling choices. We found no relationship between the number of training topics in the training shots and performance. In general, cross-encoding outperforms bi-encoding, and adding NLI training to our models gives considerable improvement, but these results are not consistent across all datasets. Our results indicate that it is essential to include multiple datasets and systematic modelling experiments when aiming to find robust modelling choices for the concept "stance".
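
To make the two task definitions in the abstract concrete, here is a minimal sketch of how Pro/Con examples might be recast as Same Side Stance pairs. The abstract does not describe the exact pairing procedure, so the function name `to_same_side_pairs`, the restriction to pairs within one topic, and the toy examples are all assumptions made for illustration only.

```python
from itertools import combinations

def to_same_side_pairs(examples):
    """Recast Pro/Con examples as Same Side Stance pairs (hypothetical helper).

    examples: list of (topic, text, stance) tuples, stance in {"pro", "con"}.
    Returns a list of (text_a, text_b, label) with label "same" or "different".
    Assumption: only texts on the same topic are paired.
    """
    pairs = []
    for (topic_a, text_a, stance_a), (topic_b, text_b, stance_b) in combinations(examples, 2):
        if topic_a != topic_b:
            continue  # skip pairs that mix topics
        label = "same" if stance_a == stance_b else "different"
        pairs.append((text_a, text_b, label))
    return pairs

# Toy data, invented for this sketch.
toy = [
    ("nuclear energy", "Reactors provide low-carbon power.", "pro"),
    ("nuclear energy", "Waste storage remains unsolved.", "con"),
    ("nuclear energy", "Modern plants are very safe.", "pro"),
    ("school uniforms", "Uniforms limit self-expression.", "con"),
]

pairs = to_same_side_pairs(toy)
for text_a, text_b, label in pairs:
    print(f"{label}: {text_a!r} / {text_b!r}")
```

Under the Pro/Con definition a model labels each text against its topic, whereas under this pairwise recasting it only decides whether two texts agree, which is the "same or different" question the abstract starts from.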
