
Unifying Deep Local and Global Features for Image Search (2001.05027v4)

Published 14 Jan 2020 in cs.CV

Abstract: Image retrieval is the problem of searching an image database for items that are similar to a query image. To address this task, two main types of image representations have been studied: global and local image features. In this work, our key contribution is to unify global and local features into a single deep model, enabling accurate retrieval with efficient feature extraction. We refer to the new model as DELG, standing for DEep Local and Global features. We leverage lessons from recent feature learning work and propose a model that combines generalized mean pooling for global features and attentive selection for local features. The entire network can be learned end-to-end by carefully balancing the gradient flow between two heads -- requiring only image-level labels. We also introduce an autoencoder-based dimensionality reduction technique for local features, which is integrated into the model, improving training efficiency and matching performance. Comprehensive experiments show that our model achieves state-of-the-art image retrieval on the Revisited Oxford and Paris datasets, and state-of-the-art single-model instance-level recognition on the Google Landmarks dataset v2. Code and models are available at https://github.com/tensorflow/models/tree/master/research/delf .

Citations (22)

Summary

  • The paper introduces the DELG model that unifies deep local and global feature learning within a single CNN for comprehensive image representations.
  • It employs generalized mean pooling, attentive selection, and a gradient control mechanism to efficiently balance local and global feature extraction using only image-level labels.
  • Experimental results on benchmark datasets demonstrate state-of-the-art performance and reduced latency compared to separate feature extraction systems.

The paper tackles image retrieval with a single deep model that efficiently produces both local and global image features. The authors introduce DELG (DEep Local and Global features), which integrates the two feature types into one convolutional neural network (CNN).

Methodology

DELG combines two recent advances in feature learning: generalized mean pooling for global features and attentive selection for local features. It exploits the hierarchical representations of a CNN by extracting global features from deep layers and local features from intermediate layers of a shared backbone, so the model captures a holistic image representation while retaining region-specific detail.
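To make the two operations concrete, here is a minimal pure-Python sketch, not the authors' implementation. Generalized mean pooling averages the p-th powers of each channel over all spatial positions and takes the p-th root, so p = 1 gives average pooling and large p approaches max pooling. Attentive selection is reduced here to keeping the k descriptors with the highest attention scores. The function names, the default p = 3, and the assumption that attention scores are already available as plain numbers are all illustrative.

```python
from typing import List, Sequence, Tuple


def gem_pool(feature_map: Sequence[Sequence[float]],
             p: float = 3.0, eps: float = 1e-6) -> List[float]:
    """Generalized mean pooling over spatial positions.

    feature_map holds one row per spatial location and one column per
    channel. p = 3.0 is an assumed default, not a value from the text.
    """
    n = len(feature_map)
    channels = len(feature_map[0])
    pooled = []
    for c in range(channels):
        # Clamp activations to eps so fractional powers stay well defined.
        total = sum(max(row[c], eps) ** p for row in feature_map)
        pooled.append((total / n) ** (1.0 / p))
    return pooled


def select_local_features(descriptors: Sequence[object],
                          scores: Sequence[float],
                          k: int) -> List[Tuple[object, float]]:
    """Keep the k descriptors with the highest attention scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(descriptors[i], scores[i]) for i in ranked[:k]]
```

In a trained model the attention scores would come from a learned module rather than being supplied by hand. The sketch only shows how such scores could be used to pick a small set of local features.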

A key property of the model is that it can be trained end-to-end using only image-level labels, which simplifies training. To balance global and local feature learning within one CNN, the authors control the gradient flow: gradients from the local feature heads are not back-propagated into the network backbone. This prevents local feature training from disrupting the representations the hierarchical structure is meant to learn.
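The effect of stopping gradients can be illustrated with a toy scalar model whose gradients are written out by hand. This is a sketch under simplifying assumptions, not the paper's training code: the backbone is a single weight `w`, the local head is a single weight `v`, both losses are squared errors, and all names are hypothetical. Deep learning frameworks provide the same behavior through operations such as `tf.stop_gradient` or `jax.lax.stop_gradient`.

```python
from typing import Tuple


def gradients(w: float, v: float, x: float,
              y_global: float, y_local: float,
              stop_local_gradient: bool = True) -> Tuple[float, float]:
    """Return (dLoss/dw, dLoss/dv) for a toy two-head model.

    Backbone feature: f = w * x
    Global loss:      (f - y_global) ** 2
    Local loss:       (v * f - y_local) ** 2
    """
    f = w * x  # shared backbone feature

    # Gradient of each loss with respect to the shared feature f.
    d_global_df = 2.0 * (f - y_global)
    local_residual = v * f - y_local
    d_local_df = 2.0 * local_residual * v

    # The local head always receives its own gradient and keeps training.
    grad_v = 2.0 * local_residual * f

    # With the stop in place, the local loss treats f as a constant,
    # so only the global loss reaches the backbone weight.
    d_total_df = d_global_df
    if not stop_local_gradient:
        d_total_df += d_local_df
    grad_w = d_total_df * x
    return grad_w, grad_v
```

Called with the same inputs, the two settings give identical gradients for the head weight `v` and differ only in the backbone gradient, which is the point of the mechanism: the local head learns on top of the backbone features without reshaping them.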

Additionally, the authors introduce an autoencoder-based dimensionality reduction technique for local features. Because the autoencoder is integrated into the model and trained with it, it replaces conventional PCA-based post-processing and yields compact local features without an additional learning stage.
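As a rough illustration of the idea, the sketch below runs a purely linear autoencoder over one local descriptor: an encoder matrix projects it to a shorter code, a decoder matrix maps the code back, and the mean squared reconstruction error is the quantity training would minimize. The linear layers, the tiny dimensions, and the function names are assumptions made for readability and do not describe the authors' architecture.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def matvec(matrix: Matrix, vector: Sequence[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def encode(descriptor: Sequence[float], encoder_weights: Matrix) -> List[float]:
    """Project a descriptor to a lower-dimensional code."""
    return matvec(encoder_weights, descriptor)


def decode(code: Sequence[float], decoder_weights: Matrix) -> List[float]:
    """Map a code back to the original descriptor dimensionality."""
    return matvec(decoder_weights, code)


def reconstruction_loss(descriptor: Sequence[float],
                        encoder_weights: Matrix,
                        decoder_weights: Matrix) -> float:
    """Mean squared error between a descriptor and its reconstruction."""
    rebuilt = decode(encode(descriptor, encoder_weights), decoder_weights)
    return sum((a - b) ** 2 for a, b in zip(descriptor, rebuilt)) / len(descriptor)
```

At retrieval time only the encoder output would be stored and matched, which is where the memory saving comes from. The decoder exists to provide a training signal.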

Experimental Results

DELG is evaluated on standard image retrieval benchmarks, including Revisited Oxford and Paris, where it achieves state-of-the-art results and outperforms earlier systems that handle local and global features separately. In global-only retrieval, DELG substantially improves mean average precision (mAP) on large-scale databases. Re-ranking the top results with local features yields further gains, confirming the precision benefit of local feature matching.
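The two-stage procedure described here, global retrieval followed by local re-ranking, can be sketched as follows. This is a simplified illustration, not the evaluation pipeline of the paper: the dictionary layout, the `top_k` and `threshold` parameters, and the use of a plain descriptor-match count as the local score are all assumptions. A real system would typically apply a stronger verification step when matching local features.

```python
from typing import Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def count_matches(query_locals: Sequence[Sequence[float]],
                  db_locals: Sequence[Sequence[float]],
                  threshold: float) -> int:
    """Count query descriptors whose best database match reaches threshold."""
    count = 0
    for q in query_locals:
        best = max((dot(q, d) for d in db_locals), default=0.0)
        if best >= threshold:
            count += 1
    return count


def search(query: Dict, database: List[Dict],
           top_k: int = 2, threshold: float = 0.9) -> List[str]:
    """Rank by global similarity, then re-rank the top_k by local matches."""
    # Stage 1: order the whole database by global descriptor similarity.
    ranked = sorted(database,
                    key=lambda item: dot(query["global"], item["global"]),
                    reverse=True)
    head, tail = ranked[:top_k], ranked[top_k:]

    # Stage 2: re-order only the shortlist by local match count.
    # sorted() is stable, so ties keep their global ranking.
    head = sorted(head,
                  key=lambda item: count_matches(query["locals"],
                                                 item["locals"], threshold),
                  reverse=True)
    return [item["name"] for item in head + tail]
```

Restricting the local stage to a shortlist keeps the expensive matching step off most of the database, while the global descriptor handles the bulk of the ranking.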

The model is further validated on the Google Landmarks dataset v2 for instance-level recognition, where DELG outperforms existing single-model solutions. The authors also analyze memory and computation trade-offs, showing that the unified model reduces latency compared to separate feature extraction systems while keeping memory usage competitive through local feature quantization.

Implications and Future Directions

The results matter for building efficient and robust image retrieval systems. Because DELG unifies feature extraction in one network, it could also support streamlined, integrated solutions in computer vision tasks beyond image retrieval.

The dimensionality reduction technique and the gradient control strategy open pathways for further work on hierarchical feature learning. Future research could improve quantization methods to ease memory constraints further, and could test the model in other domains that require precise image analysis, such as object detection and scene understanding.

Overall, this work provides an effective approach for combining global and local image analysis within a single, coherent framework, setting a foundation for future advances in the field.
