Papers
Topics
Authors
Recent
Detailed Answer
Quick Answer
Concise responses based on abstracts only
Detailed Answer
Well-researched responses based on abstracts and relevant paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses
Gemini 2.5 Flash
Gemini 2.5 Flash 28 tok/s
Gemini 2.5 Pro 40 tok/s Pro
GPT-5 Medium 16 tok/s Pro
GPT-5 High 13 tok/s Pro
GPT-4o 103 tok/s Pro
Kimi K2 197 tok/s Pro
GPT OSS 120B 471 tok/s Pro
Claude Sonnet 4 38 tok/s Pro
2000 character limit reached

Bootstrap Fine-Grained Vision-Language Alignment for Unified Zero-Shot Anomaly Localization (2308.15939v2)

Published 30 Aug 2023 in cs.CV

Abstract: Contrastive Language-Image Pre-training (CLIP) models have shown promising performance on zero-shot visual recognition tasks by learning visual representations under natural language supervision. Recent studies attempt the use of CLIP to tackle zero-shot anomaly detection by matching images with normal and abnormal state prompts. However, since CLIP focuses on building correspondence between paired text prompts and global image-level representations, the lack of fine-grained patch-level vision to text alignment limits its capability on precise visual anomaly localization. In this work, we propose AnoCLIP for zero-shot anomaly localization. In the visual encoder, we introduce a training-free value-wise attention mechanism to extract intrinsic local tokens of CLIP for patch-level local description. From the perspective of text supervision, we particularly design a unified domain-aware contrastive state prompting template for fine-grained vision-language matching. On top of the proposed AnoCLIP, we further introduce a test-time adaptation (TTA) mechanism to refine visual anomaly localization results, where we optimize a lightweight adapter in the visual encoder using AnoCLIP's pseudo-labels and noise-corrupted tokens. With both AnoCLIP and TTA, we significantly exploit the potential of CLIP for zero-shot anomaly localization and demonstrate the effectiveness of AnoCLIP on various datasets.

Citations (1)
List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-Up Questions

We haven't generated follow-up questions for this paper yet.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube