CornerNet-Lite: Efficient Keypoint Based Object Detection (1904.08900v2)

Published 18 Apr 2019 in cs.CV

Abstract: Keypoint-based methods are a relatively new paradigm in object detection, eliminating the need for anchor boxes and offering a simplified detection framework. Keypoint-based CornerNet achieves state of the art accuracy among single-stage detectors. However, this accuracy comes at high processing cost. In this work, we tackle the problem of efficient keypoint-based object detection and introduce CornerNet-Lite. CornerNet-Lite is a combination of two efficient variants of CornerNet: CornerNet-Saccade, which uses an attention mechanism to eliminate the need for exhaustively processing all pixels of the image, and CornerNet-Squeeze, which introduces a new compact backbone architecture. Together these two variants address the two critical use cases in efficient object detection: improving efficiency without sacrificing accuracy, and improving accuracy at real-time efficiency. CornerNet-Saccade is suitable for offline processing, improving the efficiency of CornerNet by 6.0x and the AP by 1.0% on COCO. CornerNet-Squeeze is suitable for real-time detection, improving both the efficiency and accuracy of the popular real-time detector YOLOv3 (34.4% AP at 30ms for CornerNet-Squeeze compared to 33.0% AP at 39ms for YOLOv3 on COCO). Together these contributions for the first time reveal the potential of keypoint-based detection to be useful for applications requiring processing efficiency.

Summary

The paper introduces CornerNet-Lite, which combines two variants: CornerNet-Saccade, which is 6.0x more efficient than CornerNet with 1.0% higher AP on COCO, and CornerNet-Squeeze, which reaches 34.4% AP at 30ms against YOLOv3's 33.0% AP at 39ms.
The paper uses an attention-based saccade mechanism to avoid processing every pixel, and a compact backbone built from lightweight convolution modules to reduce computational cost.
The paper shows that keypoint-based detectors can be made efficient enough for real-time detection and for resource-constrained applications.

CornerNet-Lite: Efficient Keypoint-Based Object Detection

The paper presents CornerNet-Lite, a keypoint-based object detector designed to balance accuracy and efficiency. Keypoint-based frameworks detect and group keypoints (such as the corners of bounding boxes) instead of relying on anchor boxes. CornerNet, the keypoint-based detector this work builds on, is accurate but computationally expensive, and CornerNet-Lite is introduced to address that cost.

CornerNet-Lite comprises two variants: CornerNet-Saccade and CornerNet-Squeeze, each targeting distinct efficiency challenges associated with object detection tasks.

CornerNet-Saccade uses an attention mechanism reminiscent of saccades in human vision: it predicts likely object locations on a downsized version of the image, then runs detection only on crops around those locations, so the image is never processed exhaustively. This yields a 6.0x efficiency improvement over the original CornerNet together with a 1.0% increase in average precision (AP) on the COCO dataset. The variant is suited to offline processing, where the goal is to improve efficiency without sacrificing accuracy.
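The crop-selection idea behind CornerNet-Saccade can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `saccade_crops`, the score threshold, the crop size, the clamping policy, and the choice to take crops at full resolution are all hypothetical, and the detector that would run on each crop is omitted.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in full-resolution pixels


def saccade_crops(
    image_size: Tuple[int, int],
    locations: List[Tuple[float, float, float]],
    downscale: float,
    crop_size: int,
    score_threshold: float,
) -> List[Box]:
    """Turn object locations predicted on a downsized image into crop boxes.

    image_size: (width, height) of the original image.
    locations: (x, y, score) triples in downsized-image coordinates.
    downscale: downsized length divided by original length, e.g. 0.25.
    crop_size, score_threshold: illustrative values, not taken from the paper.
    """
    width, height = image_size
    half = crop_size // 2
    crops: List[Box] = []
    for x, y, score in locations:
        if score < score_threshold:
            continue  # skip unpromising locations entirely
        # Map the location back to full-resolution coordinates.
        cx = int(round(x / downscale))
        cy = int(round(y / downscale))
        # Center the crop on the location, shifting it to stay inside the image.
        x0 = min(max(cx - half, 0), max(width - crop_size, 0))
        y0 = min(max(cy - half, 0), max(height - crop_size, 0))
        crops.append((x0, y0, min(x0 + crop_size, width), min(y0 + crop_size, height)))
    return crops


if __name__ == "__main__":
    candidates = [(50.0, 50.0, 0.9), (10.0, 10.0, 0.1), (245.0, 195.0, 0.8)]
    for box in saccade_crops((1000, 800), candidates, 0.25, 200, 0.3):
        print(box)
```

Only the crops returned here would be passed to the detector, which is where the saving over processing every pixel comes from.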

CornerNet-Squeeze, on the other hand, targets real-time detection by reducing the amount of processing per pixel, drawing on design principles from SqueezeNet and MobileNets. Specifically, it builds a compact stacked hourglass backbone from lightweight modules that use $1 \times 1$ convolutions and depth-wise separable convolutions. On COCO, the result is both faster and more accurate than YOLOv3: 34.4% AP at 30ms, compared with YOLOv3's 33.0% AP at 39ms.
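To see why these modules are lighter, it helps to count weights per layer. The sketch below uses the standard formulas for a $k \times k$ convolution, a $1 \times 1$ convolution, and a depth-wise separable convolution. The channel width of 256 is an arbitrary illustrative choice, not a figure from the paper, and bias terms are ignored.

```python
def standard_conv_weights(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def pointwise_conv_weights(c_in: int, c_out: int) -> int:
    """Weights in a 1 x 1 convolution (bias ignored)."""
    return c_in * c_out


def depthwise_separable_weights(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depth-wise conv followed by a 1 x 1 point-wise conv."""
    return k * k * c_in + c_in * c_out


if __name__ == "__main__":
    c = 256  # illustrative channel width, an assumption for this example
    full = standard_conv_weights(c, c, 3)
    separable = depthwise_separable_weights(c, c, 3)
    pointwise = pointwise_conv_weights(c, c)
    print(f"3x3 standard:            {full}")
    print(f"3x3 depth-wise separable: {separable}")
    print(f"1x1 point-wise:           {pointwise}")
    print(f"standard / separable:     {full / separable:.2f}")
```

For equal input and output widths, a $3 \times 3$ standard convolution has nine times the weights of a $1 \times 1$ convolution, and the separable version costs only slightly more than the $1 \times 1$ layer it contains.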

The paper reports two headline results:

CornerNet-Saccade: Achieves an AP of 43.2% at 190ms per image on COCO, which is 1.0% higher AP and 6.0x more efficient than the original CornerNet.
CornerNet-Squeeze: Achieves a better efficiency-accuracy tradeoff than YOLOv3 (34.4% AP at 30ms versus 33.0% AP at 39ms), indicating that keypoint-based methods are suitable for real-time applications.
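The reported figures can be related to each other with some simple arithmetic. The sketch below converts latencies to frames per second and backs out the CornerNet baseline implied by the 6.0x and 1.0% figures. It assumes that "efficiency" means per-image inference time, and the implied baseline is a back-of-envelope estimate from rounded numbers, not a value quoted in this summary.

```python
def frames_per_second(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# Figures quoted in the text above.
squeeze_ms, squeeze_ap = 30.0, 34.4
yolo_ms, yolo_ap = 39.0, 33.0
saccade_ms, saccade_ap = 190.0, 43.2
saccade_speedup, saccade_ap_gain = 6.0, 1.0

# CornerNet-Squeeze versus YOLOv3.
latency_ratio = yolo_ms / squeeze_ms
ap_margin = squeeze_ap - yolo_ap

# Baseline implied for the original CornerNet (an estimate, see lead-in).
implied_cornernet_ms = saccade_ms * saccade_speedup
implied_cornernet_ap = saccade_ap - saccade_ap_gain

if __name__ == "__main__":
    print(f"CornerNet-Squeeze: {frames_per_second(squeeze_ms):.1f} fps")
    print(f"YOLOv3:            {frames_per_second(yolo_ms):.1f} fps")
    print(f"Latency ratio:     {latency_ratio:.2f}x, AP margin: {ap_margin:.1f}")
    print(f"Implied CornerNet: about {implied_cornernet_ms:.0f}ms at {implied_cornernet_ap:.1f}% AP")
```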

These findings have three main implications:

Enhanced Processing Efficiency: CornerNet-Lite shows that keypoint-based detection frameworks can be optimized to cut computational cost substantially, which benefits resource-constrained environments.
Practical Application in Real-Time Scenarios: Because CornerNet-Squeeze outperforms the established real-time detector YOLOv3 in both speed and accuracy, real-time applications can gain accuracy without sacrificing speed.
Potential for Integration: Future work may explore hybrid models that combine other efficient detection frameworks with localized attention mechanisms.

The paper leaves room for further work on keypoint-based object detection, chiefly in refining the attention mechanism and the backbone design. More advanced integration methods or more modular structures within such frameworks remain to be investigated. As hardware and compute capabilities continue to improve, the CornerNet-Lite architecture may serve as a foundation for later work on efficient object detection across application domains.
