Emergent Mind

Anomaly detection through latent space restoration using vector-quantized variational autoencoders

(arXiv:2012.06765)
Published Dec 12, 2020 in cs.CV , cs.LG , and eess.IV

Abstract

We propose an out-of-distribution detection method that combines density and restoration-based approaches using Vector-Quantized Variational Auto-Encoders (VQ-VAEs). The VQ-VAE model learns to encode images in a categorical latent space. The prior distribution of latent codes is then modelled using an Auto-Regressive (AR) model. We found that the prior probability estimated by the AR model can be useful for unsupervised anomaly detection and enables the estimation of both sample and pixel-wise anomaly scores. The sample-wise score is defined as the negative log-likelihood of the latent variables above a threshold selecting highly unlikely codes. Additionally, out-of-distribution images are restored into in-distribution images by replacing unlikely latent codes with samples from the prior model and decoding to pixel space. The average L1 distance between generated restorations and original image is used as pixel-wise anomaly score. We tested our approach on the MOOD challenge datasets, and report higher accuracies compared to a standard reconstruction-based approach with VAEs.

Overview

  • The paper investigates the use of Vector Quantized Variational Auto-Encoders (VQ-VAEs) together with Auto-Regressive (AR) models for anomaly detection in medical images, representing a shift towards unsupervised learning.

  • VQ-VAEs improve upon traditional VAEs by encoding data into a discrete latent space, enhancing image reconstruction quality and enabling efficient anomaly detection.

  • A unique method called 'Latent Space Restoration' is introduced for anomaly localization, which substitutes and decodes unlikely latent codes to compare restorations against original images.

  • The proposed method outperformed standard VAE models in anomaly detection accuracy on the MOOD challenge datasets, consisting of brain MR and abdominal CT images.


Introduction to Anomaly Detection in Medical Imagery

Anomaly detection in medical images plays a critical role in identifying conditions deviating from a defined norm. Traditional methods rely heavily on supervised learning, requiring extensive annotated datasets and often failing to generalize across unfamiliar pathologies. Recent advancements have shifted focus towards unsupervised learning, with Variational Auto-Encoders (VAEs) gaining traction for their ability to learn and reconstruct the underlying distribution of standard images, thereby flagging anomalies.

Vector Quantized Variational Auto-Encoders (VQ-VAEs)

The paper explores the use of Vector Quantized Variational Auto-Encoders (VQ-VAEs) for anomaly detection. Unlike traditional VAEs, which encode data into a continuous latent space, VQ-VAEs map input data to a finite set of embeddings in a discrete latent space, significantly enhancing the quality of image reconstructions. The discrete nature of this space allows highly expressive Auto-Regressive (AR) models to accurately learn the prior distribution of latent codes, facilitating efficient anomaly detection.
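The quantization step at the heart of a VQ-VAE can be sketched as a nearest-neighbour lookup into a learned codebook. The following NumPy snippet is illustrative only: the codebook and encoder outputs are random stand-ins, not trained values, and the function name `quantize` is our own.

```python
import numpy as np

def quantize(z_e, codebook):
    """Map each encoder output vector to its nearest codebook embedding.

    z_e:      (H, W, D) array of continuous encoder outputs.
    codebook: (K, D) array of K learned embedding vectors.
    Returns the (H, W) grid of discrete code indices and the
    (H, W, D) quantized latents z_q.
    """
    flat = z_e.reshape(-1, z_e.shape[-1])                      # (H*W, D)
    # Squared Euclidean distance from every latent to every embedding.
    d = ((flat[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)                                     # nearest code per latent
    z_q = codebook[idx].reshape(z_e.shape)
    return idx.reshape(z_e.shape[:-1]), z_q

# Toy example: a 4x4 latent grid, 8-dim embeddings, 16-entry codebook.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 8))
z_e = rng.normal(size=(4, 4, 8))
idx, z_q = quantize(z_e, codebook)
print(idx.shape, z_q.shape)  # (4, 4) (4, 4, 8)
```

The resulting grid of integer indices is exactly the categorical representation over which an AR prior such as PixelSNAIL can be trained.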

Implementing VQ-VAEs and Auto-Regressive Models

The proposed method employs a VQ-VAE architecture with a categorical latent space, whose prior is modeled by PixelSNAIL, an AR model known for its state-of-the-art performance in density estimation. This combination enables the computation of both sample-wise and pixel-wise anomaly scores by leveraging the likelihood estimates provided by the AR model. Notably, the methodology includes a distinctive approach to anomaly localization dubbed "Latent Space Restoration": unlikely latent codes are replaced with samples from the prior model, and the result is decoded to produce restorations that are compared against the original image for anomaly scoring.
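The restoration procedure described above can be sketched as follows. This is a minimal NumPy illustration with stand-in data: the per-code negative log-likelihoods, the prior sampler, and the threshold value are all toy stubs standing in for a trained PixelSNAIL prior and VQ-VAE decoder.

```python
import numpy as np

def restore_latents(codes, code_nll, sample_prior, nll_threshold):
    """Replace latent codes the AR prior deems unlikely (illustrative sketch).

    codes:         (H, W) int array of discrete latent codes.
    code_nll:      (H, W) negative log-likelihood of each code under the prior.
    sample_prior:  callable returning a replacement code for a latent position.
    nll_threshold: codes with NLL above this are deemed unlikely and resampled.
    """
    restored = codes.copy()
    unlikely = code_nll > nll_threshold
    for pos in zip(*np.nonzero(unlikely)):
        restored[pos] = sample_prior(pos)   # draw an in-distribution code
    return restored, unlikely

def pixel_anomaly_score(image, restorations):
    """Average L1 distance between the input and its decoded restorations."""
    return np.mean([np.abs(image - r) for r in restorations], axis=0)

# Toy demonstration (no trained models involved).
rng = np.random.default_rng(1)
codes = rng.integers(0, 16, size=(4, 4))
code_nll = rng.uniform(0.0, 5.0, size=(4, 4))
restored, unlikely = restore_latents(codes, code_nll,
                                     sample_prior=lambda pos: 0,
                                     nll_threshold=4.0)
print(unlikely.sum(), "codes resampled")
```

In the actual method, each restored latent grid would be decoded back to pixel space by the VQ-VAE decoder before the L1 distances are averaged.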

Experimental Validation and Results

The efficacy of the proposed method was validated using the MOOD challenge datasets, encompassing brain MR and abdominal CT images. The analysis demonstrated superior accuracy of the VQ-VAE approach in identifying anomalies when compared to standard VAE models, particularly in sample-wise anomaly detection. This performance is attributed to the expressive power of the VQ-VAE's discrete latent space, coupled with PixelSNAIL's adeptness at AR modeling.

Technical Specifications and Findings

Detailed implementation aspects are provided, including the architecture of the VQ-VAE and PixelSNAIL, the loss function specifics, and the algorithm for computing anomaly scores. The method's sensitivity to anomaly pixel intensity, an area identified for future improvement, highlights its potential for further refinement.
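As the abstract states, the sample-wise score sums the negative log-likelihood of only those latent codes that fall above an "unlikeliness" threshold. A minimal sketch of that scoring rule, with toy NLL grids and a threshold chosen purely for illustration:

```python
import numpy as np

def sample_anomaly_score(code_nll, nll_threshold):
    """Sample-wise score: sum the NLL of only the highly unlikely codes.

    Restricting the sum to codes above the threshold keeps the many
    well-explained codes from washing out a few very anomalous ones.
    """
    unlikely = code_nll > nll_threshold
    return float(code_nll[unlikely].sum())

# A grid with one very unlikely code scores higher than a uniform one.
normal = np.full((4, 4), 1.0)
anomalous = normal.copy()
anomalous[2, 2] = 12.0
print(sample_anomaly_score(normal, 5.0))     # 0.0
print(sample_anomaly_score(anomalous, 5.0))  # 12.0
```

The pixel-wise counterpart, per the paper, is the average L1 distance between the input image and its decoded restorations.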

Implications and Future Directions

This research underscores the potential of combining VQ-VAEs with AR models for unsupervised anomaly detection in medical images, offering a robust framework that surpasses traditional VAE-based approaches. The findings not only contribute to the theoretical understanding of anomaly detection mechanisms but also promise significant advancements in the practical application of AI in medical diagnostics. Future work may explore alternative anomaly scoring techniques and extend the validation of this method across a broader spectrum of medical conditions and imaging modalities.

Acknowledgements and Ethical Compliance

The paper acknowledges the data sources and confirms compliance with ethical standards concerning the retrospective use of human subject data, ensuring the integrity of the research process.

In summary, the research charts a sophisticated methodological path toward enhancing anomaly detection in medical imagery, built on the synergy between VQ-VAEs and AR modeling. As the field progresses, such innovations are likely to become pivotal in harnessing the full potential of AI in healthcare diagnostics.
