Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection (2404.01819v1)
Abstract: In this paper, we address the limitations of the DETR-based semi-supervised object detection (SSOD) framework, focusing on the challenges posed by the quality of object queries. In DETR-based SSOD, the one-to-one assignment strategy provides inaccurate pseudo-labels, while the one-to-many assignment strategy leads to overlapping predictions. These issues compromise training efficiency and degrade model performance, especially in detecting small or occluded objects. To overcome these challenges, we introduce Sparse Semi-DETR, a novel transformer-based, end-to-end semi-supervised object detection solution. Sparse Semi-DETR incorporates a Query Refinement Module to enhance the quality of object queries, significantly improving detection capabilities for small and partially obscured objects. Additionally, we integrate a Reliable Pseudo-Label Filtering Module that selectively retains high-quality pseudo-labels, thereby enhancing detection accuracy and consistency. On the MS-COCO and Pascal VOC object detection benchmarks, Sparse Semi-DETR achieves a significant improvement over current state-of-the-art methods, highlighting its effectiveness in semi-supervised object detection, particularly in challenging scenarios involving small or partially obscured objects.