ReAct: Out-of-distribution Detection With Rectified Activations (2111.12797v1)

Published 24 Nov 2021 in cs.LG

Abstract: Out-of-distribution (OOD) detection has received much attention lately due to its practical importance in enhancing the safe deployment of neural networks. One of the primary challenges is that models often produce highly confident predictions on OOD data, which undermines the driving principle in OOD detection that the model should only be confident about in-distribution samples. In this work, we propose ReAct--a simple and effective technique for reducing model overconfidence on OOD data. Our method is motivated by novel analysis on internal activations of neural networks, which displays highly distinctive signature patterns for OOD distributions. Our method can generalize effectively to different network architectures and different OOD detection scores. We empirically demonstrate that ReAct achieves competitive detection performance on a comprehensive suite of benchmark datasets, and give theoretical explication for our method's efficacy. On the ImageNet benchmark, ReAct reduces the false positive rate (FPR95) by 25.05% compared to the previous best method.

Authors (3)

Yiyou Sun
Chuan Guo
Yixuan Li

Citations (392)

Summary

The paper introduces ReAct, a method that truncates high neural activations to reduce overconfidence on out-of-distribution inputs.
It reports a 25.05% reduction in FPR95 on the ImageNet benchmark relative to the previous best method, and shows consistent gains across architectures such as ResNet and MobileNet.
The paper provides a theoretical analysis of activation distributions that explains why the method works, which is relevant to safer deployment in critical applications.

ReAct: Out-of-distribution Detection With Rectified Activations

The paper "ReAct: Out-of-distribution Detection With Rectified Activations" addresses a central challenge in deploying neural networks in real-world applications: reliably detecting out-of-distribution (OOD) inputs. OOD data, which the model never encountered during training, can elicit overconfident predictions, and this can compromise safety and effectiveness in critical tasks such as autonomous driving or healthcare. The paper proposes a simple method, ReAct, that mitigates this overconfidence by rectifying (truncating) activations within the network.

Key Contributions and Findings

The authors introduce ReAct, a technique that exploits the distinctive activation patterns triggered by OOD data. For OOD samples, unit activations show high variance and positive skewness, which sets them apart from in-distribution (ID) samples. ReAct truncates activations above a chosen threshold, which leaves ID activations largely intact while suppressing the abnormally large activations produced by OOD inputs.
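
The truncation step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names `fit_threshold` and `react_clip`, the toy activation values, and the choice of the 90th percentile of ID activations as the threshold are all assumptions made for the example.

```python
import numpy as np

def fit_threshold(id_features: np.ndarray, percentile: float = 90.0) -> float:
    """Pick the clipping threshold c as a percentile of ID activations (assumed rule)."""
    return float(np.percentile(id_features, percentile))

def react_clip(features: np.ndarray, c: float) -> np.ndarray:
    """Truncate activations from above: every value greater than c becomes c."""
    return np.minimum(features, c)

# Toy activations (rows = samples, columns = units); values are made up.
id_features = np.array([[0.2, 0.5, 0.4, 0.3],
                        [0.1, 0.6, 0.5, 0.2]])
ood_features = np.array([[0.0, 4.0, 0.1, 0.0]])  # one abnormally large unit

c = fit_threshold(id_features, percentile=90.0)
clipped_id = react_clip(id_features, c)
clipped_ood = react_clip(ood_features, c)

# Total activation mass removed by clipping, per group.
id_removed = float((id_features - clipped_id).sum())
ood_removed = float((ood_features - clipped_ood).sum())
print(f"c={c:.2f}  removed from ID={id_removed:.2f}  removed from OOD={ood_removed:.2f}")
```

In this toy setting the threshold is estimated from ID data only, so most ID activations pass through unchanged while the single large OOD activation is cut down to the threshold.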

Empirically, ReAct improves OOD detection performance across a variety of benchmarks. On the ImageNet benchmark, it reduces the false positive rate at 95% true positive rate (FPR95) by 25.05% compared to the previous best method, which shows that it remains effective at large scale. The paper evaluates ReAct on several network architectures, including ResNet and MobileNet, and finds consistent improvements in detection metrics. The method is also compatible with different OOD scoring functions, such as softmax probability and energy-based scores.
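
For readers unfamiliar with the metric and scores named above, the following sketch shows one common way to compute them. It assumes the usual convention that ID samples are the positive class and that higher scores mean "more in-distribution"; the function names and the toy scores are illustrative and do not come from the paper.

```python
import numpy as np

def msp_score(logits: np.ndarray) -> np.ndarray:
    """Maximum softmax probability per sample; higher means more ID-like."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return (exp / exp.sum(axis=1, keepdims=True)).max(axis=1)

def energy_score(logits: np.ndarray) -> np.ndarray:
    """Log-sum-exp of the logits per sample; higher means more ID-like."""
    m = logits.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))).ravel()

def fpr_at_95_tpr(id_scores: np.ndarray, ood_scores: np.ndarray) -> float:
    """Fraction of OOD samples accepted at the threshold that keeps 95% of ID samples."""
    threshold = np.percentile(id_scores, 5)  # 95% of ID scores lie at or above this
    return float(np.mean(ood_scores >= threshold))

# Toy scores: 20 ID samples and 4 OOD samples (made-up values).
id_scores = np.arange(1.0, 21.0)
ood_scores = np.array([0.5, 1.0, 3.0, 5.0])
fpr = fpr_at_95_tpr(id_scores, ood_scores)
print(fpr)
```

A lower FPR95 is better: it means fewer OOD inputs slip through when the detector is tuned to accept 95% of ID inputs.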

Theoretical Insights and Mechanisms

From a theoretical perspective, the paper explains why ReAct improves OOD detection. Activation distributions of OOD samples are modeled as positively skewed Gaussian distributions, which lead to higher mean activations. Truncating these activations lowers the logit outputs, and it lowers them more for OOD inputs than for ID inputs because of the stark contrast in activation patterns, so the OOD scores separate further from the ID ones. This analysis clarifies why ReAct works and offers a basis for future OOD research.
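
The intuition in this paragraph can be checked with a small simulation. The sketch below is not the paper's derivation: it uses a symmetric Gaussian passed through ReLU as a stand-in for ID activations, and a two-piece Gaussian with a stretched right half as a stand-in for positively skewed OOD activations. The parameters `mu`, `sigma`, the stretch factor, and the 90th-percentile threshold are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
mu, sigma = 0.5, 0.5  # illustrative values, not taken from the paper

# ID stand-in: symmetric Gaussian pre-activations passed through ReLU.
id_act = np.maximum(0.0, mu + sigma * rng.standard_normal(n))

# OOD stand-in: same left half, right half stretched 3x (positive skew).
z = rng.standard_normal(n)
ood_act = np.maximum(0.0, mu + sigma * np.where(z > 0, 3.0 * z, z))

# Clipping threshold estimated from ID activations only.
c = np.percentile(id_act, 90)

# How much the mean activation drops after truncation at c.
id_drop = id_act.mean() - np.minimum(id_act, c).mean()
ood_drop = ood_act.mean() - np.minimum(ood_act, c).mean()
print(f"mean reduction  ID: {id_drop:.3f}  OOD: {ood_drop:.3f}")
```

Under these assumptions the skewed distribution loses far more of its mean to truncation than the symmetric one, which mirrors the argument that logits computed from OOD activations shrink more than those computed from ID activations.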

Implications and Future Directions

Practically, ReAct offers a simple implementation that enhances the robustness of pre-trained networks without any need for re-training. This post hoc strategy aligns well with practical constraints in deploying large-scale models in dynamic environments. The method is particularly promising for applications where safety and reliability are paramount, providing a means to flag unfamiliar inputs for further scrutiny or alternative handling.
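
As a rough illustration of the post hoc workflow, the sketch below applies truncation and an energy-style score on top of a frozen classification head and flags low-scoring inputs. Everything here is hypothetical: the helper names `react_energy` and `flag_ood`, the weights `W` and `b`, the clipping threshold `c`, and the decision threshold are made-up values, not settings from the paper.

```python
import numpy as np

def react_energy(features, W, b, c):
    """Energy-style score from clipped features and frozen head weights."""
    logits = np.minimum(features, c) @ W + b
    m = logits.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))).ravel()

def flag_ood(features, W, b, c, score_threshold):
    """Return True for inputs whose score falls below the decision threshold."""
    return react_energy(features, W, b, c) < score_threshold

# Frozen head of a hypothetical pre-trained model: 3 units, 2 classes.
W = np.array([[2.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
b = np.zeros(2)

features = np.array([[1.0, 0.1, 0.2],   # ID-like input
                     [0.1, 0.1, 6.0]])  # OOD-like input with one huge activation

flags = flag_ood(features, W, b, c=1.0, score_threshold=2.0)
flags_no_clip = flag_ood(features, W, b, c=np.inf, score_threshold=2.0)
print(flags.tolist(), flags_no_clip.tolist())
```

No weights are updated at any point. In this toy case the OOD-like input is flagged only when clipping is enabled, because its single large activation otherwise inflates the score.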

Theoretically, the paper opens avenues for deeper exploration into internal activation mechanisms and their role in differentiating ID and OOD data. Future inquiries may explore variations of ReAct across diverse data modalities beyond vision, or further refine truncation strategies that dynamically adapt to different levels of skewness and variance inherent in OOD data distributions.

In summary, the paper presents a simple, post hoc approach to OOD detection: truncating high activations reduces overconfidence on OOD inputs and improves detection without retraining. These results position ReAct as a practical tool for improving the robustness and safety of neural networks across a range of applications.
