Benchmarking Object Detectors with COCO: A New Path Forward

(arXiv:2403.18819)
Published Mar 27, 2024 in cs.CV

Abstract

The Common Objects in Context (COCO) dataset has been instrumental in benchmarking object detectors over the past decade. Like every dataset, COCO contains subtle errors and imperfections stemming from its annotation procedure. With the advent of high-performing models, we ask whether these errors of COCO are hindering its utility in reliably benchmarking further progress. In search of an answer, we inspect thousands of masks from COCO (2017 version) and uncover different types of errors such as imprecise mask boundaries, non-exhaustively annotated instances, and mislabeled masks. Due to the prevalence of COCO, we choose to correct these errors to maintain continuity with prior research. We develop COCO-ReM (Refined Masks), a cleaner set of annotations with visibly better mask quality than COCO-2017. We evaluate fifty object detectors and find that models that predict visually sharper masks score higher on COCO-ReM, affirming that they were being incorrectly penalized due to errors in COCO-2017. Moreover, our models trained using COCO-ReM converge faster and score higher than their larger variants trained using COCO-2017, highlighting the importance of data quality in improving object detectors. With these findings, we advocate using COCO-ReM for future object detection research. Our dataset is available at https://cocorem.xyz

COCO-ReM enhances COCO-2017 mask quality, addressing occlusions and providing exhaustive instance annotations.

Overview

  • COCO-ReM significantly enhances COCO 2017 by refining instance mask annotations, enabling more accurate and reliable benchmarking of object detectors and segmentation models.

  • Through manual inspection, various annotation flaws were identified in COCO 2017, such as imprecise mask boundaries and inconsistent handling of occlusions, which COCO-ReM addresses.

  • The COCO-ReM dataset introduces a three-stage annotation pipeline including mask boundary refinement using SAM, exhaustive instance annotation, and correction of labeling errors.

  • Experimental evaluation showed that models achieve higher AP scores on COCO-ReM, with query-based models demonstrating notably better performance, highlighting the importance of data quality for advancing object detector performance.

COCO-ReM: Enhancing COCO 2017 with Refined Instance Masks for Object Detection

Introduction

The Common Objects in Context (COCO) dataset has been a foundational benchmark for object detection and segmentation models. However, the COCO 2017 version exhibits several annotation imperfections, such as imprecise mask boundaries, inconsistent occlusion handling, and non-exhaustive instance annotations. These imperfections can undermine the reliability of benchmarks for state-of-the-art models. In response, this paper presents COCO-ReM (Refined Masks), a substantial improvement over COCO 2017 that features high-quality instance annotations designed to enable more accurate and reliable model benchmarking.

Revisiting COCO Masks

A comprehensive manual inspection of COCO 2017 annotations revealed several recurring flaws: mask boundaries were often coarse, occlusions were handled inconsistently, some instances were not exhaustively annotated, and near-duplicate masks overlapped heavily while carrying different labels. These issues highlight the necessity for refined annotations to accurately assess the performance of object detectors and segmentation models.
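The near-duplicate issue above is mechanical enough to screen for automatically. The following is a minimal sketch (not the paper's actual tooling) that flags pairs of binary masks with high overlap but differing category labels, using plain mask IoU; the function names and the 0.9 threshold are illustrative assumptions.

```python
import numpy as np

def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union between two boolean masks of the same shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / union if union > 0 else 0.0

def find_near_duplicates(masks, labels, iou_thresh=0.9):
    """Flag index pairs whose masks overlap heavily but whose labels differ,
    i.e. candidate near-duplicate annotations worth manual review."""
    pairs = []
    for i in range(len(masks)):
        for j in range(i + 1, len(masks)):
            if labels[i] != labels[j] and mask_iou(masks[i], masks[j]) >= iou_thresh:
                pairs.append((i, j))
    return pairs
```

In practice such a screen only surfaces candidates; deciding which label is correct still requires human verification, as the paper's pipeline does.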

COCO-ReM: Dataset and Annotation Pipeline

COCO-ReM introduces refined instance masks for COCO images, addressing the identified annotation imperfections. The annotation pipeline consists of three stages:

  1. Mask Boundary Refinement: Using the Segment Anything Model (SAM), mask boundaries were refined to be visually more precise.
  2. Exhaustive Instance Annotation: Instances were exhaustively annotated by importing additional instances from the LVIS dataset and using predictions from LVIS-trained models, significantly augmenting the COCO 2017 annotations.
  3. Correction of Labeling Errors: Manual verification focused on correcting labeling errors and ensuring consistent annotation quality.
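To make stage 1 concrete, the sketch below shows one plausible way to prompt SAM from an existing coarse mask: derive the tightest bounding box around the mask and pass it as a box prompt. This is an illustration of the general SAM prompting workflow, not the paper's exact procedure; the `mask_to_box` helper is hypothetical, and the commented lines assume the `segment-anything` package's `SamPredictor` API.

```python
import numpy as np

def mask_to_box(mask: np.ndarray) -> tuple:
    """Tightest (x0, y0, x1, y1) box around a boolean mask, usable
    as a box prompt for a SAM predictor."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# With segment-anything installed, refinement would look roughly like:
#   predictor.set_image(image)                 # RGB HxWx3 uint8 array
#   box = np.array(mask_to_box(coarse_mask))
#   refined_masks, scores, _ = predictor.predict(
#       box=box, multimask_output=False)       # sharper boundary for the same instance
```

The refined mask then replaces the coarse polygon, while stages 2 and 3 handle missing instances and label errors separately.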

Experimental Validation

Fifty object detectors, encompassing both region-based and query-based models, were evaluated using COCO-ReM. The findings demonstrate that:

  • Models achieve higher AP scores on COCO-ReM compared to COCO 2017, indicating that previous benchmarks might have penalized models due to annotation imperfections rather than model inadequacy.
  • Query-based models, which predict visually sharper masks, score significantly higher on COCO-ReM. This suggests that COCO-ReM provides a more accurate and sensitive benchmark for comparing model performance.
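For readers unfamiliar with the AP metric these comparisons rest on, here is a simplified, single-IoU-threshold illustration: detections are sorted by confidence, each is marked as a true or false positive by matching against ground truth, and precision is integrated over recall. Real COCO evaluation (via `pycocotools`) instead averages over ten IoU thresholds with 101-point interpolation, so treat this as a sketch of the idea rather than the official metric.

```python
import numpy as np

def average_precision(scores, matched, num_gt):
    """AP at one IoU threshold: sort detections by confidence, accumulate
    precision/recall, and integrate the precision envelope over recall."""
    order = np.argsort(scores)[::-1]
    tp = np.asarray(matched, dtype=float)[order]   # 1 if matched to ground truth
    fp = 1.0 - tp
    tp_cum, fp_cum = np.cumsum(tp), np.cumsum(fp)
    recall = tp_cum / num_gt
    precision = tp_cum / (tp_cum + fp_cum)
    # Monotone non-increasing precision envelope, then area under the curve.
    envelope = np.maximum.accumulate(precision[::-1])[::-1]
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recall, envelope):
        ap += (r - prev_r) * p
        prev_r = r
    return ap
```

Because imprecise ground-truth boundaries lower the IoU of an otherwise correct sharp prediction, they can flip true positives into false positives, which is exactly how annotation errors depress the AP of boundary-accurate models.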

Implications and Future Directions

The development of COCO-ReM emphasizes the critical role of data quality in advancing object detector performance. This dataset not only facilitates more reliable benchmarking but also underscores potential avenues for future research, including the exploration of query-based models and the continuous refinement of datasets. For sustained progress in AI, the community must prioritize the refinement of benchmark datasets alongside model development.

COCO-ReM represents a step forward in enhancing the utility of the COCO dataset for future object detection research. By addressing annotation imperfections, this refined dataset sets a new standard for benchmarking object detectors, enabling fairer and more accurate evaluation of model capabilities.
