A Survey on Out-of-Distribution Detection in NLP

(2305.03236)
Published May 5, 2023 in cs.CL , cs.AI , and cs.LG

Abstract

Out-of-distribution (OOD) detection is essential for the reliable and safe deployment of machine learning systems in the real world. Great progress has been made over the past years. This paper presents the first review of recent advances in OOD detection with a particular focus on natural language processing approaches. First, we provide a formal definition of OOD detection and discuss several related fields. We then categorize recent algorithms into three classes according to the data they used: (1) OOD data available, (2) OOD data unavailable + in-distribution (ID) label available, and (3) OOD data unavailable + ID label unavailable. Third, we introduce datasets, applications, and metrics. Finally, we summarize existing work and present potential future research topics.

Figure: Taxonomy categorizing out-of-distribution detection methods.

Overview

  • Out-of-Distribution (OOD) detection is essential for the robustness and reliability of NLP models, focusing on identifying inputs that significantly differ from the model's training data.

  • The paper offers a novel classification of OOD detection methods in NLP based on data availability, and discusses datasets, applications, metrics, and future research directions.

  • It distinguishes strategies by whether labeled OOD data, only labeled in-distribution (ID) data, or no labels at all are available, covering approaches that leverage labeled OOD and ID data, generate pseudo-OOD samples, or rely on unsupervised learning.

  • Future research areas are identified, emphasizing the importance of integrating OOD detection with domain generalization, leveraging extra information sources, and theoretical explorations within OOD detection.


Introduction to OOD Detection in NLP

Out-of-Distribution (OOD) detection has emerged as a pivotal aspect of ensuring the robustness and reliability of machine learning models, especially in NLP applications. OOD detection identifies inputs that significantly diverge from the model's training distribution; such inputs pose challenges for the real-world deployment of AI systems. This paper presents a systematic review of advances in OOD detection tailored to NLP, proposing a novel classification of OOD detection methods based on data availability and discussing datasets, applications, metrics, and future directions in the context of AI safety in NLP.

Methodological Classifications

OOD Data Available

Methods assuming access to both in-distribution (ID) and OOD data during model training are further divided into two categories:

  • Detection with Extensive OOD Data: Techniques in this category leverage labeled OOD data alongside ID data to refine model learning, catering to scenarios where extensive OOD samples are available for training.
  • Detection with Few OOD Data: These methods, acknowledging the impracticality of acquiring large-scale labeled OOD datasets, focus on generating pseudo-OOD samples from a limited set of real OOD instances.
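
As a concrete illustration of the first setting, training with labeled OOD data is often formulated as an outlier-exposure-style objective: standard cross-entropy on ID examples plus a term that pushes predictions on known outliers toward the uniform distribution. Below is a minimal numpy sketch; the function names and the weighting factor `lam` are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def outlier_exposure_loss(id_logits, id_labels, ood_logits, lam=0.5):
    """Cross-entropy on ID data plus a uniformity penalty on labeled OOD data.

    The OOD term is the cross-entropy between the uniform distribution and the
    model's predictions on known outliers, encouraging maximal uncertainty there.
    """
    # Standard cross-entropy on labeled in-distribution examples.
    id_probs = softmax(id_logits)
    ce = -np.log(id_probs[np.arange(len(id_labels)), id_labels]).mean()
    # H(U, p) = -(1/K) * sum_k log p_k, averaged over OOD examples.
    ood_log_probs = np.log(softmax(ood_logits))
    uniform_ce = -ood_log_probs.mean(axis=-1).mean()
    return ce + lam * uniform_ce
```

For perfectly uniform OOD predictions over K classes the penalty term equals log K, its minimum, so the regularizer only hurts when the model is confident on outliers.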

OOD Data Unavailable + ID Label Available

In the absence of OOD data, several strategies have been developed to exploit labeled ID data exclusively:

  • Learn Representations Then Detect: Approaches here aim at extracting discriminative features conducive to differentiating ID from OOD samples and subsequently scoring these samples for detection.
  • Generate Pseudo OOD Samples: This strategy revolves around simulating OOD samples using various data augmentation and generation techniques, effectively circumventing the absence of real OOD instances.
  • Other Approaches: This includes various innovative techniques that do not neatly fit into the two aforementioned categories.
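
A common instance of the "learn representations then detect" recipe is post-hoc scoring of a trained classifier's outputs, for example with the maximum softmax probability (MSP): confident predictions suggest ID inputs, flat ones suggest OOD. A minimal numpy sketch (the threshold value is an illustrative assumption, not from the paper):

```python
import numpy as np

def msp_score(logits):
    """Maximum softmax probability: higher means more in-distribution."""
    z = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return probs.max(axis=-1)

def detect_ood(logits, threshold=0.7):
    """Flag an input as OOD when its MSP falls below the threshold."""
    return msp_score(logits) < threshold
```

In practice the threshold is tuned on held-out ID data (e.g. to fix the ID false-positive rate), and richer scores such as energy or Mahalanobis distance follow the same score-then-threshold pattern.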

OOD Data Unavailable + ID Label Unavailable

This setting is akin to unsupervised learning challenges, focusing on anomaly detection without labeled data. Techniques here primarily aim to learn robust representations that inherently segregate ID and OOD data.
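One simple instance of this unsupervised idea is distance-based scoring in a learned embedding space: score each input by its distance to its k-th nearest neighbor among unlabeled training embeddings, so inputs far from the training manifold get high OOD scores. A numpy sketch (the function name, L2 normalization, and choice of k are illustrative assumptions):

```python
import numpy as np

def knn_ood_scores(train_emb, test_emb, k=5):
    """Score each test embedding by its distance to the k-th nearest
    training embedding; larger distances suggest OOD inputs."""
    # L2-normalize so distances are comparable across embedding norms.
    train = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    test = test_emb / np.linalg.norm(test_emb, axis=1, keepdims=True)
    # Pairwise Euclidean distances between test and train embeddings.
    dists = np.linalg.norm(test[:, None, :] - train[None, :, :], axis=-1)
    # k-th smallest distance per test point (k is 1-indexed here).
    return np.sort(dists, axis=1)[:, k - 1]
```

The brute-force pairwise distance matrix is fine for a sketch; real systems would swap in an approximate nearest-neighbor index for large training sets.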

Datasets and Applications

The paper categorizes OOD detection datasets by how their OOD instances are constructed, and discusses prevalent applications across NLP tasks, highlighting the utility of OOD detection in enhancing the safety and reliability of language-based models.

Evaluation Metrics

It presents an overview of standard metrics for assessing OOD detectors, such as AUROC, AUPR, and FPR@N, emphasizing their role in providing complementary, largely threshold-free evaluations of detection performance.
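
These metrics can be computed directly from detector scores. The sketch below gives minimal numpy implementations of AUROC (rank-sum formulation, assuming continuous scores without ties) and FPR@N, under the convention that higher scores indicate OOD:

```python
import numpy as np

def auroc(id_scores, ood_scores):
    """AUROC via the rank-sum (Mann-Whitney U) formulation.

    OOD is treated as the positive class; ties are not averaged, which is
    adequate for continuous scores.
    """
    scores = np.concatenate([ood_scores, id_scores])
    labels = np.concatenate([np.ones(len(ood_scores)), np.zeros(len(id_scores))])
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos_rank_sum = ranks[labels == 1].sum()
    n_pos, n_neg = len(ood_scores), len(id_scores)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """FPR@N: false-positive rate on ID data at the threshold where
    a fraction `tpr` of OOD inputs is correctly detected."""
    thresh = np.quantile(ood_scores, 1 - tpr)
    return (id_scores >= thresh).mean()
```

For perfectly separated scores AUROC is 1.0 and FPR@95 is 0.0; in practice library implementations (e.g. scikit-learn's `roc_auc_score`) handle ties and edge cases more carefully.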

Future Directions

The paper underscores promising research directions, including integrating OOD detection with domain generalization, leveraging extra information sources, and combining OOD detection with lifelong learning frameworks. It also notes the need for theoretical work on OOD detection.

Concluding Remarks

Through presenting a structured analysis of OOD detection methodologies tailored to NLP, this paper sheds light on the complexities and nuances inherent in ensuring AI systems' robustness against OOD inputs. By delineating current strategies, datasets, applications, and future directions, it contributes a foundational framework that supports ongoing and future research endeavors aimed at fortifying AI against the challenges posed by OOD instances.
