Benchmarking Object Detectors with COCO: A New Path Forward (2403.18819v1)

Published 27 Mar 2024 in cs.CV

Abstract: The Common Objects in Context (COCO) dataset has been instrumental in benchmarking object detectors over the past decade. Like every dataset, COCO contains subtle errors and imperfections stemming from its annotation procedure. With the advent of high-performing models, we ask whether these errors of COCO are hindering its utility in reliably benchmarking further progress. In search for an answer, we inspect thousands of masks from COCO (2017 version) and uncover different types of errors such as imprecise mask boundaries, non-exhaustively annotated instances, and mislabeled masks. Due to the prevalence of COCO, we choose to correct these errors to maintain continuity with prior research. We develop COCO-ReM (Refined Masks), a cleaner set of annotations with visibly better mask quality than COCO-2017. We evaluate fifty object detectors and find that models that predict visually sharper masks score higher on COCO-ReM, affirming that they were being incorrectly penalized due to errors in COCO-2017. Moreover, our models trained using COCO-ReM converge faster and score higher than their larger variants trained using COCO-2017, highlighting the importance of data quality in improving object detectors. With these findings, we advocate using COCO-ReM for future object detection research. Our dataset is available at https://cocorem.xyz

Summary

The paper presents COCO-ReM, which refines COCO 2017 annotations by addressing mask imprecision, occlusion inconsistencies, and non-exhaustive instance labeling.
It outlines a three-stage pipeline using the Segment Anything Model, supplementary LVIS instance data, and manual corrections to improve annotation quality.
Evaluation of 50 object detectors shows higher AP scores on COCO-ReM than on COCO 2017, particularly for models that predict visually sharper masks, which supports its use for fairer model evaluation.

COCO-ReM: Enhancing COCO 2017 with Refined Instance Masks for Object Detection

Introduction

The Common Objects in Context (COCO) dataset has been a foundational benchmark for object detection and segmentation models. However, the COCO 2017 version exhibits several annotation imperfections, such as imprecise mask boundaries, inconsistent occlusion handling, and non-exhaustive instance annotations. These imperfections can make benchmark results for state-of-the-art models unreliable. In response, this paper presents COCO-ReM (Refined Masks), a cleaner set of annotations for COCO 2017 images with visibly better mask quality, designed to enable more accurate and reliable model benchmarking.

Revisiting COCO Masks

A manual inspection of thousands of COCO 2017 masks revealed several kinds of annotation flaws. Mask boundaries were often coarse, occlusions were handled inconsistently, some instances were left unannotated, and some near-duplicate masks overlapped heavily but carried different labels. These issues show why refined annotations are needed to assess object detectors and segmentation models accurately.
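The near-duplicate issue described above can be illustrated with a small check. This is a minimal sketch, not the authors' inspection procedure: the function names, the dictionary layout (`label`, `mask`), and the 0.9 IoU threshold are all assumptions made for illustration, and masks are represented as nested lists of 0/1 to keep the example dependency-free.

```python
from itertools import combinations

def mask_iou(a, b):
    """Intersection-over-union of two equal-sized binary masks (nested lists of 0/1)."""
    inter = union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += 1 if (pa and pb) else 0
            union += 1 if (pa or pb) else 0
    return inter / union if union else 0.0

def find_label_conflicts(instances, iou_threshold=0.9):
    """Return (i, j, iou) for instance pairs whose masks overlap heavily
    but whose labels differ. The threshold is a hypothetical choice."""
    conflicts = []
    for (i, a), (j, b) in combinations(enumerate(instances), 2):
        if a["label"] == b["label"]:
            continue  # same label: not a labeling conflict
        iou = mask_iou(a["mask"], b["mask"])
        if iou >= iou_threshold:
            conflicts.append((i, j, iou))
    return conflicts

# Toy 3x3 masks: the first two cover the same pixels with different labels.
instances = [
    {"label": "cup",  "mask": [[1, 1, 0], [1, 1, 0], [0, 0, 0]]},
    {"label": "bowl", "mask": [[1, 1, 0], [1, 1, 0], [0, 0, 0]]},
    {"label": "cup",  "mask": [[0, 0, 0], [0, 0, 1], [0, 0, 1]]},
]
conflicts = find_label_conflicts(instances)
print(conflicts)
```

Pairs flagged this way would still need a human decision about which label is correct; the sketch only narrows down where to look.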

COCO-ReM: Dataset and Annotation Pipeline

COCO-ReM introduces refined instance masks for COCO images, addressing the identified annotation imperfections. The annotation pipeline consists of three stages:

Mask Boundary Refinement: Using the Segment Anything Model (SAM), mask boundaries were refined to be visually more precise.
Exhaustive Instance Annotation: Instances were exhaustively annotated by importing additional instances from the LVIS dataset and using predictions from LVIS-trained models, significantly augmenting the COCO 2017 annotations.
Correction of Labeling Errors: Manual verification focused on correcting labeling errors and ensuring consistent annotation quality.
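To make the second stage more concrete, the following sketch shows one plausible way to add instances from another annotation source without duplicating objects that are already annotated. It is an assumption for illustration, not the pipeline used to build COCO-ReM: the function names and the 0.5 box-IoU threshold are hypothetical, and the only convention taken from COCO itself is the `[x, y, width, height]` bounding-box format and the `category_id` field.

```python
def box_iou(a, b):
    """IoU of two boxes in COCO [x, y, width, height] format."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def merge_missing_instances(existing, imported, iou_threshold=0.5):
    """Append imported annotations that do not match an existing annotation
    of the same category. The matching rule is a hypothetical heuristic."""
    merged = list(existing)
    for cand in imported:
        duplicate = any(
            cand["category_id"] == ann["category_id"]
            and box_iou(cand["bbox"], ann["bbox"]) >= iou_threshold
            for ann in merged
        )
        if not duplicate:
            merged.append(cand)
    return merged

existing = [{"category_id": 1, "bbox": [10, 10, 50, 100]}]
imported = [
    {"category_id": 1, "bbox": [12, 11, 50, 98]},   # overlaps the existing box
    {"category_id": 1, "bbox": [200, 20, 40, 90]},  # no counterpart in existing
]
merged = merge_missing_instances(existing, imported)
print(len(merged))
```

A real pipeline would also need to map category vocabularies between the two sources and would likely compare masks rather than boxes; both steps are omitted here.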

Experimental Validation

Fifty object detectors, encompassing both region-based and query-based models, were evaluated using COCO-ReM. The findings demonstrate that:

Models achieve higher AP scores on COCO-ReM than on COCO 2017, indicating that the original annotations may have penalized models for annotation imperfections rather than for model inadequacy.
Query-based models, which predict visually sharper masks, score significantly higher on COCO-ReM. This suggests that COCO-ReM is a more accurate and more sensitive benchmark for comparing models.
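One way to see why coarse ground-truth boundaries can penalize sharp predictions is to recall that COCO-style AP is averaged over ten IoU thresholds, from 0.50 to 0.95 in steps of 0.05. The sketch below is a deliberate simplification: it looks at a single prediction and counts the thresholds at which it would match its ground-truth mask, ignoring the precision-recall computation that full AP performs. The two IoU values are hypothetical numbers chosen for illustration, not results from the paper.

```python
def coco_iou_thresholds():
    """The ten IoU thresholds used by COCO-style AP: 0.50, 0.55, ..., 0.95."""
    return [round(0.50 + 0.05 * k, 2) for k in range(10)]

def matched_fraction(iou):
    """Fraction of COCO thresholds at which a prediction with this IoU
    counts as a match to its ground-truth mask."""
    thresholds = coco_iou_thresholds()
    return sum(iou >= t for t in thresholds) / len(thresholds)

# Hypothetical IoUs of the same sharp prediction against two ground truths.
iou_vs_coarse_gt = 0.78   # ground-truth boundary is imprecise
iou_vs_refined_gt = 0.93  # ground-truth boundary has been refined

print(matched_fraction(iou_vs_coarse_gt))
print(matched_fraction(iou_vs_refined_gt))
```

Under these assumed values the same prediction matches at six of ten thresholds against the coarse mask and at nine of ten against the refined one, so the stricter thresholds are where boundary quality in the ground truth matters most.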

Implications and Future Directions

COCO-ReM highlights the role of data quality in improving object detectors: according to the paper, models trained on COCO-ReM converge faster and score higher than their larger variants trained on COCO 2017. The dataset supports more reliable benchmarking and points to avenues for future research, including further study of query-based models and continued refinement of datasets. The broader implication is that benchmark datasets need to be refined alongside model development.

COCO-ReM represents a step forward in enhancing the utility of the COCO dataset for future object detection research. By addressing annotation imperfections, this refined dataset sets a new standard for benchmarking object detectors, enabling fairer and more accurate evaluation of model capabilities.

GitHub

kdexd/coco-rem: Code for the paper "Benchmarking Object Detectors with COCO: A New Path Forward."
