A Survey on Out-of-Distribution Detection in NLP

(2305.03236)
Published May 5, 2023 in cs.CL , cs.AI , and cs.LG

Abstract

Out-of-distribution (OOD) detection is essential for the reliable and safe deployment of machine learning systems in the real world. Great progress has been made over the past years. This paper presents the first review of recent advances in OOD detection with a particular focus on natural language processing approaches. First, we provide a formal definition of OOD detection and discuss several related fields. We then categorize recent algorithms into three classes according to the data they used: (1) OOD data available, (2) OOD data unavailable + in-distribution (ID) label available, and (3) OOD data unavailable + ID label unavailable. Third, we introduce datasets, applications, and metrics. Finally, we summarize existing work and present potential future research topics.

Figure: Taxonomy categorizing out-of-distribution detection methods.

Overview

  • Out-of-Distribution (OOD) detection is essential for the robustness and reliability of NLP models, focusing on identifying inputs that significantly differ from the model's training data.

  • The paper offers a novel classification of OOD detection methods in NLP based on data availability, and discusses datasets, applications, metrics, and future research directions.

  • It distinguishes strategies by whether labeled OOD data, only labeled in-distribution (ID) data, or no labels at all are available, covering approaches that leverage labeled OOD and ID data, generate pseudo-OOD samples, or rely on unsupervised learning.

  • Future research areas are identified, emphasizing the importance of integrating OOD detection with domain generalization, leveraging extra information sources, and theoretical explorations within OOD detection.


Introduction to OOD Detection in NLP

Out-of-Distribution (OOD) detection has emerged as a pivotal aspect of ensuring the robustness and reliability of machine learning models, especially in NLP applications. OOD detection identifies inputs that significantly diverge from the model's training distribution; such inputs pose challenges for the real-world deployment of AI systems. This paper presents a systematic review of advances in OOD detection tailored to NLP, proposing a novel classification of OOD detection methods based on data availability and discussing datasets, applications, metrics, and future directions in the context of AI safety in NLP.

Methodological Classifications

OOD Data Available

Methods assuming access to both in-distribution (ID) and OOD data during model training are further divided into two categories:

  • Detection with Extensive OOD Data: Techniques in this category leverage labeled OOD data alongside ID data to refine model learning, catering to scenarios where extensive OOD samples are available for training.
  • Detection with Few OOD Data: These methods, acknowledging the impracticality of acquiring large-scale labeled OOD datasets, focus on generating pseudo-OOD samples from a limited set of real OOD instances.
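
As a concrete illustration of the first setting, training with labeled OOD data is often formulated as an outlier-exposure-style objective: standard cross-entropy on ID examples plus a term that pushes predictions on known outliers toward the uniform distribution. Below is a minimal numpy sketch; the function names and the weighting factor `lam` are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def outlier_exposure_loss(id_logits, id_labels, ood_logits, lam=0.5):
    """Cross-entropy on ID data plus a uniformity penalty on labeled OOD data.

    The OOD term is the cross-entropy between the uniform distribution and the
    model's predictions on known outliers, encouraging maximal uncertainty there.
    """
    # Standard cross-entropy on labeled in-distribution examples.
    id_probs = softmax(id_logits)
    ce = -np.log(id_probs[np.arange(len(id_labels)), id_labels]).mean()
    # H(U, p) = -(1/K) * sum_k log p_k, averaged over OOD examples.
    ood_log_probs = np.log(softmax(ood_logits))
    uniform_ce = -ood_log_probs.mean(axis=-1).mean()
    return ce + lam * uniform_ce
```

For perfectly uniform OOD predictions over K classes the penalty term equals log K, its minimum, so the regularizer only hurts when the model is confident on outliers.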

OOD Data Unavailable + ID Label Available

In the absence of OOD data, several strategies have been developed to exploit labeled ID data exclusively:

  • Learn Representations Then Detect: Approaches here aim at extracting discriminative features conducive to differentiating ID from OOD samples and subsequently scoring these samples for detection.
  • Generate Pseudo OOD Samples: This strategy revolves around simulating OOD samples using various data augmentation and generation techniques, effectively circumventing the absence of real OOD instances.
  • Other Approaches: This includes various innovative techniques that do not neatly fit into the two aforementioned categories.
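
A common instance of the "learn representations then detect" recipe is post-hoc scoring of a trained classifier's outputs, for example with the maximum softmax probability (MSP): confident predictions suggest ID inputs, flat ones suggest OOD. A minimal numpy sketch (the threshold value is an illustrative assumption, not from the paper):

```python
import numpy as np

def msp_score(logits):
    """Maximum softmax probability: higher means more in-distribution."""
    z = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return probs.max(axis=-1)

def detect_ood(logits, threshold=0.7):
    """Flag an input as OOD when its MSP falls below the threshold."""
    return msp_score(logits) < threshold
```

In practice the threshold is tuned on held-out ID data (e.g. to fix the ID false-positive rate), and richer scores such as energy or Mahalanobis distance follow the same score-then-threshold pattern.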

OOD Data Unavailable + ID Label Unavailable

This setting is akin to unsupervised learning challenges, focusing on anomaly detection without labeled data. Techniques here primarily aim to learn robust representations that inherently segregate ID and OOD data.
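One simple instance of this unsupervised idea is distance-based scoring in a learned embedding space: score each input by its distance to its k-th nearest neighbor among unlabeled training embeddings, so inputs far from the training manifold get high OOD scores. A numpy sketch (the function name, L2 normalization, and choice of k are illustrative assumptions):

```python
import numpy as np

def knn_ood_scores(train_emb, test_emb, k=5):
    """Score each test embedding by its distance to the k-th nearest
    training embedding; larger distances suggest OOD inputs."""
    # L2-normalize so distances are comparable across embedding norms.
    train = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    test = test_emb / np.linalg.norm(test_emb, axis=1, keepdims=True)
    # Pairwise Euclidean distances between test and train embeddings.
    dists = np.linalg.norm(test[:, None, :] - train[None, :, :], axis=-1)
    # k-th smallest distance per test point (k is 1-indexed here).
    return np.sort(dists, axis=1)[:, k - 1]
```

The brute-force pairwise distance matrix is fine for a sketch; real systems would swap in an approximate nearest-neighbor index for large training sets.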

Datasets and Applications

The paper categorizes OOD detection datasets by how their OOD instances are constructed, and discusses prevalent applications across NLP tasks, highlighting the utility of OOD detection in enhancing the safety and reliability of language-based models.

Evaluation Metrics

It presents an overview of standard metrics for assessing OOD detectors, such as AUROC, AUPR, and FPR@N, emphasizing their role in providing complementary, largely threshold-free evaluations of detection performance.
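
These metrics can be computed directly from detector scores. The sketch below gives minimal numpy implementations of AUROC (rank-sum formulation, assuming continuous scores without ties) and FPR@N, under the convention that higher scores indicate OOD:

```python
import numpy as np

def auroc(id_scores, ood_scores):
    """AUROC via the rank-sum (Mann-Whitney U) formulation.

    OOD is treated as the positive class; ties are not averaged, which is
    adequate for continuous scores.
    """
    scores = np.concatenate([ood_scores, id_scores])
    labels = np.concatenate([np.ones(len(ood_scores)), np.zeros(len(id_scores))])
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos_rank_sum = ranks[labels == 1].sum()
    n_pos, n_neg = len(ood_scores), len(id_scores)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """FPR@N: false-positive rate on ID data at the threshold where
    a fraction `tpr` of OOD inputs is correctly detected."""
    thresh = np.quantile(ood_scores, 1 - tpr)
    return (id_scores >= thresh).mean()
```

For perfectly separated scores AUROC is 1.0 and FPR@95 is 0.0; in practice library implementations (e.g. scikit-learn's `roc_auc_score`) handle ties and edge cases more carefully.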

Future Directions

The paper underscores promising research directions, including integrating OOD detection with domain generalization, leveraging extra information sources, and combining OOD detection with lifelong learning frameworks. It also notes the need for theoretical work on OOD detection.

Concluding Remarks

Through presenting a structured analysis of OOD detection methodologies tailored to NLP, this paper sheds light on the complexities and nuances inherent in ensuring AI systems' robustness against OOD inputs. By delineating current strategies, datasets, applications, and future directions, it contributes a foundational framework that supports ongoing and future research endeavors aimed at fortifying AI against the challenges posed by OOD instances.
