Meta-DETR: Image-Level Few-Shot Detection with Inter-Class Correlation Exploitation (2208.00219v1)

Published 30 Jul 2022 in cs.CV, cs.AI, cs.LG, and cs.MM

Abstract: Few-shot object detection has been extensively investigated by incorporating meta-learning into region-based detection frameworks. Despite its success, the said paradigm is still constrained by several factors, such as (i) low-quality region proposals for novel classes and (ii) negligence of the inter-class correlation among different classes. Such limitations hinder the generalization of base-class knowledge for the detection of novel-class objects. In this work, we design Meta-DETR, which (i) is the first image-level few-shot detector, and (ii) introduces a novel inter-class correlational meta-learning strategy to capture and leverage the correlation among different classes for robust and accurate few-shot object detection. Meta-DETR works entirely at image level without any region proposals, which circumvents the constraint of inaccurate proposals in prevalent few-shot detection frameworks. In addition, the introduced correlational meta-learning enables Meta-DETR to simultaneously attend to multiple support classes within a single feedforward, which allows to capture the inter-class correlation among different classes, thus significantly reducing the misclassification over similar classes and enhancing knowledge generalization to novel classes. Experiments over multiple few-shot object detection benchmarks show that the proposed Meta-DETR outperforms state-of-the-art methods by large margins. The implementation codes are available at https://github.com/ZhangGongjie/Meta-DETR.

Citations (73)

Summary

  • The paper introduces Meta-DETR, which eliminates region proposals by employing a DETR-based image-level few-shot detection framework.
  • It utilizes an inter-class correlational meta-learning strategy that processes multiple support classes simultaneously to reduce misclassification.
  • Meta-DETR achieves superior mAP on benchmark datasets, demonstrating robust performance even with minimal labeled data.

The paper presents Meta-DETR, a few-shot object detector that operates entirely at the image level and does not rely on region proposals. Earlier few-shot detectors build on region-based frameworks such as Faster R-CNN, whose region proposals tend to be of low quality for novel classes. Meta-DETR instead builds on DETR and predicts directly at the image level, so inaccurate proposals no longer limit the detection of novel objects.

The second key component of Meta-DETR is an inter-class correlational meta-learning strategy, which lets the model capture and leverage correlations among different classes. Unlike previous approaches that treat each support class independently, Meta-DETR attends to multiple support classes within a single feed-forward pass. According to the paper, this improves generalization to novel classes and significantly reduces misclassification among similar classes.
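The contrast between scoring each support class on its own and scoring several support classes together can be illustrated with a toy example. The sketch below is not Meta-DETR's aggregation module: the prototype vectors, the class names, and the functions `independent_scores` and `joint_scores` are hypothetical, and a plain sigmoid and softmax stand in for the learned matching. It only shows why a joint pass can separate similar classes that independent passes score almost identically.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def independent_scores(query, prototypes):
    """Score each support class in isolation with a sigmoid.
    Each class is judged without knowledge of the others."""
    return {name: 1.0 / (1.0 + math.exp(-dot(query, proto)))
            for name, proto in prototypes.items()}

def joint_scores(query, prototypes):
    """Score all support classes in one pass with a softmax,
    so that similar classes compete for the same query."""
    logits = {name: dot(query, proto) for name, proto in prototypes.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {name: math.exp(value - top) for name, value in logits.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}

# Hypothetical class prototypes: "cow" and "horse" are deliberately similar.
prototypes = {
    "cow": [1.0, 0.9, 0.0],
    "horse": [0.9, 1.0, 0.0],
    "bus": [0.0, 0.0, 1.0],
}
query = [2.0, 1.5, 0.0]  # hypothetical query feature, slightly closer to "cow"

ind = independent_scores(query, prototypes)
joint = joint_scores(query, prototypes)
print("independent:", {k: round(v, 3) for k, v in ind.items()})
print("joint:      ", {k: round(v, 3) for k, v in joint.items()})
```

In this toy setting the independent scores for "cow" and "horse" are both above 0.9, so both classes would fire on the same object, whereas the joint scores sum to one and rank "cow" first.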

The paper reports that Meta-DETR outperforms state-of-the-art methods by large margins on multiple few-shot object detection benchmarks, including Pascal VOC and MS COCO. The reported gains in detection mAP appear across varying shot settings, which supports the claim that the model learns effectively from minimal labeled data.
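For readers unfamiliar with shot settings, the sketch below shows one way a K-shot support set could be drawn. It assumes that "K-shot" means K annotated object instances per novel class, which is a common convention but is not spelled out in this summary. The annotation tuples and the function `sample_k_shot` are hypothetical and are not taken from the Meta-DETR code base.

```python
import random
from collections import defaultdict

def sample_k_shot(annotations, k, seed=0):
    """Pick k annotated instances per class.

    annotations: list of (image_id, class_name, box) tuples (hypothetical format).
    Returns a dict mapping class_name to a list of k annotations.
    """
    by_class = defaultdict(list)
    for ann in annotations:
        by_class[ann[1]].append(ann)

    rng = random.Random(seed)  # fixed seed keeps the support set reproducible
    support = {}
    for cls in sorted(by_class):
        items = by_class[cls]
        if len(items) < k:
            raise ValueError(f"class {cls!r} has {len(items)} instances, need {k}")
        support[cls] = rng.sample(items, k)
    return support

# Hypothetical annotations for two novel classes.
annotations = [
    ("img1", "bird", (10, 10, 50, 40)),
    ("img2", "bird", (5, 20, 60, 80)),
    ("img3", "bird", (30, 30, 90, 70)),
    ("img1", "sofa", (0, 50, 200, 150)),
    ("img4", "sofa", (20, 40, 180, 160)),
    ("img5", "sofa", (15, 60, 170, 140)),
]

for k in (1, 2, 3):
    support = sample_k_shot(annotations, k)
    print(k, {cls: len(items) for cls, items in support.items()})
```

Evaluating the same detector on support sets drawn with several values of `k` is what produces the per-shot mAP figures that few-shot detection papers typically report.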

In practice, removing the dependency on region proposals makes the detector more robust when only a handful of labeled samples are available. The ability to exploit inter-class correlations could also help in real-world applications where novel object categories appear frequently and annotations are scarce.

On the theoretical side, Meta-DETR's results highlight the potential of image-level frameworks for few-shot detection and the value of correlational learning strategies. As the research community continues to explore few-shot and zero-shot learning, the model offers a template for future work that uses holistic image features and class relationships to improve learning efficiency and accuracy.

Future research may explore integrating multi-scale features into the Meta-DETR framework, potentially improving detection capabilities for small or occluded objects. Furthermore, extending the correlational meta-learning strategy to other vision tasks, such as segmentation or tracking, provides a promising avenue for expanding the framework's applicability.

In conclusion, Meta-DETR advances few-shot object detection by forgoing traditional region-based methodologies in favor of an image-level approach that exploits inter-class correlations. The framework improves the adaptability and accuracy of object detectors and provides a strong foundation for further research in the field.
