Multiscale Positive-Unlabeled Detection of AI-Generated Texts (2305.18149v4)
Abstract: Recent releases of LLMs, e.g. ChatGPT, are astonishing at generating human-like texts, but they may compromise the authenticity of texts. Previous work proposed methods to detect these AI-generated texts, including simple ML classifiers, pretrained-model-based zero-shot methods, and finetuned language classification models. However, mainstream detectors consistently fail on short texts, like SMSes, Tweets, and reviews. In this paper, a Multiscale Positive-Unlabeled (MPU) training framework is proposed to address the difficulty of short-text detection without sacrificing long-text performance. Firstly, we acknowledge the human-resemblance property of short machine texts, and rephrase AI text detection as a partial Positive-Unlabeled (PU) problem by regarding these short machine texts as partially ``unlabeled". Then in this PU context, we propose the length-sensitive Multiscale PU Loss, where a recurrent model in abstraction is used to estimate positive priors of scale-variant corpora. Additionally, we introduce a Text Multiscaling module to enrich training corpora. Experiments show that our MPU method improves detection performance on long AI-generated texts, and significantly improves short-text detection of LLM detectors. LLMs trained with MPU could outcompete existing detectors on various short-text and long-text detection benchmarks. The codes are available at https://github.com/mindspore-lab/mindone/tree/master/examples/detect_chatgpt and https://github.com/YuchuanTian/AIGC_text_detector.
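The core idea above, treating short machine texts as partially unlabeled and training the detector with a PU risk estimator, can be sketched with the standard non-negative PU loss (Kiryo et al., 2017) that the paper builds on. This is a minimal illustration only: the paper's actual Multiscale PU Loss additionally makes the positive prior length-sensitive via a recurrent abstraction, which is not reproduced here, and all function and variable names below are illustrative assumptions.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)), used as a surrogate loss.
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def nn_pu_loss(scores_pos, scores_unl, prior):
    """Non-negative PU risk estimator (Kiryo et al., 2017).

    scores_pos: detector logits for labeled "positive" texts
    scores_unl: logits for texts treated as unlabeled
                (e.g. short machine texts, per the MPU framing)
    prior:      estimated fraction of positives in the unlabeled corpus;
                MPU would estimate this per text length (not shown here)
    """
    mean = lambda xs: sum(xs) / len(xs)
    risk_pos_on_p = mean([softplus(-s) for s in scores_pos])  # predict +1 on P
    risk_neg_on_p = mean([softplus(s) for s in scores_pos])   # predict -1 on P
    risk_neg_on_u = mean([softplus(s) for s in scores_unl])   # predict -1 on U
    # The negative-class risk is clamped at zero -- the "non-negative"
    # correction that prevents the estimator from going negative and
    # overfitting to the unlabeled set.
    neg_risk = max(risk_neg_on_u - prior * risk_neg_on_p, 0.0)
    return prior * risk_pos_on_p + neg_risk
```

In a training loop this scalar would simply replace the usual binary cross-entropy term, with `prior` supplied per batch; the multiscale variant would vary `prior` with text length.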