Multiscale Positive-Unlabeled Detection of AI-Generated Texts (2305.18149v4)
Abstract: Recent releases of LLMs, e.g. ChatGPT, are astonishing at generating human-like texts, but they may compromise the authenticity of texts. Previous work proposed methods to detect these AI-generated texts, including simple ML classifiers, pretrained-model-based zero-shot methods, and finetuned language classification models. However, mainstream detectors consistently fail on short texts, like SMSes, Tweets, and reviews. In this paper, a Multiscale Positive-Unlabeled (MPU) training framework is proposed to address the difficulty of short-text detection without sacrificing long-text performance. Firstly, we acknowledge the human-resemblance property of short machine texts, and rephrase AI text detection as a partial Positive-Unlabeled (PU) problem by regarding these short machine texts as partially ``unlabeled". Then in this PU context, we propose the length-sensitive Multiscale PU Loss, where a recurrent model in abstraction is used to estimate positive priors of scale-variant corpora. Additionally, we introduce a Text Multiscaling module to enrich training corpora. Experiments show that our MPU method improves detection performance on long AI-generated texts, and significantly improves short-text detection of LLM detectors. LLMs trained with MPU could outcompete existing detectors on various short-text and long-text detection benchmarks. The codes are available at https://github.com/mindspore-lab/mindone/tree/master/examples/detect_chatgpt and https://github.com/YuchuanTian/AIGC_text_detector.
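The core idea above, treating short machine texts as partially unlabeled and training the detector with a PU risk estimator, can be sketched with the standard non-negative PU loss (Kiryo et al., 2017) that the paper builds on. This is a minimal illustration only: the paper's actual Multiscale PU Loss additionally makes the positive prior length-sensitive via a recurrent abstraction, which is not reproduced here, and all function and variable names below are illustrative assumptions.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)), used as a surrogate loss.
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def nn_pu_loss(scores_pos, scores_unl, prior):
    """Non-negative PU risk estimator (Kiryo et al., 2017).

    scores_pos: detector logits for labeled "positive" texts
    scores_unl: logits for texts treated as unlabeled
                (e.g. short machine texts, per the MPU framing)
    prior:      estimated fraction of positives in the unlabeled corpus;
                MPU would estimate this per text length (not shown here)
    """
    mean = lambda xs: sum(xs) / len(xs)
    risk_pos_on_p = mean([softplus(-s) for s in scores_pos])  # predict +1 on P
    risk_neg_on_p = mean([softplus(s) for s in scores_pos])   # predict -1 on P
    risk_neg_on_u = mean([softplus(s) for s in scores_unl])   # predict -1 on U
    # The negative-class risk is clamped at zero -- the "non-negative"
    # correction that prevents the estimator from going negative and
    # overfitting to the unlabeled set.
    neg_risk = max(risk_neg_on_u - prior * risk_neg_on_p, 0.0)
    return prior * risk_pos_on_p + neg_risk
```

In a training loop this scalar would simply replace the usual binary cross-entropy term, with `prior` supplied per batch; the multiscale variant would vary `prior` with text length.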