
Similarity Distance-Based Label Assignment for Tiny Object Detection (2407.02394v3)

Published 2 Jul 2024 in cs.CV

Abstract: Tiny object detection is becoming one of the most challenging tasks in computer vision because of the limited object size and lack of information. The label assignment strategy is a key factor affecting the accuracy of object detection. Although there are some effective label assignment strategies for tiny objects, most of them focus on reducing the sensitivity to the bounding boxes in order to increase the number of positive samples, and they rely on fixed hyperparameters that must be set by hand. However, more positive samples do not necessarily lead to better detection results; in fact, excessive positive samples may lead to more false positives. In this paper, we introduce a simple but effective strategy named the Similarity Distance (SimD) to evaluate the similarity between bounding boxes. The proposed strategy not only considers both location and shape similarity but also learns its hyperparameters adaptively, ensuring that it can adapt to different datasets and to various object sizes within a dataset. Our approach can be applied directly in common anchor-based detectors in place of the IoU for label assignment and Non-Maximum Suppression (NMS). Extensive experiments on four mainstream tiny object detection datasets demonstrate the superior performance of our method; in particular, it outperforms the state-of-the-art competitors on AI-TOD by 1.8 AP points overall and by 4.1 AP points on very tiny objects. Code is available at: https://github.com/cszzshi/SimD.
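
To illustrate what "in place of the IoU" can mean in practice, the following is a minimal sketch of greedy NMS with a pluggable similarity function. It is not the SimD formula and is not taken from the authors' repository: `toy_similarity` is a hypothetical stand-in that combines a center-distance term and a width/height term, each normalized by the mean box size, and all function names and the threshold value are assumptions made for this example.

```python
import math
from typing import Callable, List, Sequence

Box = Sequence[float]  # (x1, y1, x2, y2), an assumed box layout


def iou(a: Box, b: Box) -> float:
    """Standard intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def toy_similarity(a: Box, b: Box) -> float:
    """Hypothetical location-plus-shape similarity in (0, 1]; NOT the paper's SimD."""
    aw, ah = a[2] - a[0], a[3] - a[1]
    bw, bh = b[2] - b[0], b[3] - b[1]
    scale = (aw + ah + bw + bh) / 4.0  # mean side length of the two boxes
    if scale <= 0:
        return 0.0
    # Location term: distance between box centers, relative to box size.
    loc = math.hypot((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2,
                     (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) / scale
    # Shape term: difference in width and height, relative to box size.
    shape = math.hypot(aw - bw, ah - bh) / scale
    return math.exp(-(loc + shape))


def nms(boxes: List[Box], scores: List[float],
        similarity: Callable[[Box, Box], float] = iou,
        threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep a box unless it is too similar to a higher-scoring kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(similarity(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep


# Two adjacent 4x4 boxes share no area, so IoU gives no signal at all,
# while the distance-based stand-in still returns a graded score.
tiny_a, tiny_b = (0, 0, 4, 4), (4, 0, 8, 4)
print(iou(tiny_a, tiny_b), toy_similarity(tiny_a, tiny_b))
```

Because `nms` only depends on a callable that maps two boxes to a score, swapping the measure is a one-argument change, for example `nms(boxes, scores, similarity=toy_similarity)`. The same pattern would apply to the positive/negative thresholding step in label assignment.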

