Kernelized Deep Convolutional Neural Network for Describing Complex Images

(1509.04581)
Published Sep 15, 2015 in cs.CV, cs.AI, cs.IR, and cs.MM

Abstract

With their impressive capability to capture visual content, deep convolutional neural networks (CNNs) have demonstrated promising performance in various vision-based applications, such as classification, recognition, and object detection. However, due to the intrinsic structural design of CNNs, for images with complex content they achieve only limited invariance to translation, rotation, and resizing, a property that is strongly emphasized in content-based image retrieval. To address this problem, we propose a new kernelized deep convolutional neural network. We first motivate the approach with an experimental study demonstrating the sensitivity of the global CNN feature to these basic geometric transformations. We then propose to represent visual content with approximate invariance to these transformations from a kernelized perspective: we extract CNN features from detected object-like patches and aggregate these patch-level features into a vector representation with the Fisher vector model. The effectiveness of the proposed algorithm is demonstrated on image search with three benchmark datasets.
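
The aggregation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes the commonly used Fisher vector encoding: gradients with respect to the means and variances of a diagonal-covariance Gaussian mixture, followed by power and L2 normalization. The function name `fisher_vector`, the toy mixture parameters, and the random arrays standing in for patch-level CNN features are all hypothetical and are not taken from the paper.

```python
import numpy as np

def fisher_vector(x, weights, means, variances):
    """Encode descriptors x (N, D) with a diagonal GMM (K components) as a 2*K*D vector.

    Assumed formulation: standard Fisher vector with power and L2 normalization.
    """
    n = x.shape[0]
    sigma = np.sqrt(variances)                                   # (K, D)
    diff = (x[:, None, :] - means[None, :, :]) / sigma[None]     # (N, K, D), whitened offsets

    # Log of w_k * N(x_n | mu_k, diag(var_k)) for every descriptor/component pair.
    log_prob = (np.log(weights)[None, :]
                - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)[None, :]
                - 0.5 * np.sum(diff ** 2, axis=2))               # (N, K)

    # Soft assignments (posteriors), computed stably by subtracting the row maximum.
    log_prob -= log_prob.max(axis=1, keepdims=True)
    gamma = np.exp(log_prob)
    gamma /= gamma.sum(axis=1, keepdims=True)                    # (N, K)

    # Gradients with respect to the means and the standard deviations.
    g_mu = np.einsum('nk,nkd->kd', gamma, diff) / (n * np.sqrt(weights)[:, None])
    g_sigma = (np.einsum('nk,nkd->kd', gamma, diff ** 2 - 1)
               / (n * np.sqrt(2 * weights)[:, None]))

    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                       # power normalization
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv                         # L2 normalization

# Toy usage: random arrays stand in for CNN features of 20 detected patches.
rng = np.random.default_rng(0)
k, d = 3, 8
patch_features = rng.normal(size=(20, d))
weights = np.full(k, 1.0 / k)        # hypothetical mixture weights
means = rng.normal(size=(k, d))      # hypothetical component means
variances = np.ones((k, d))          # hypothetical diagonal variances
image_vector = fisher_vector(patch_features, weights, means, variances)
```

In this sketch the output has a fixed length of 2 × K × D regardless of how many patches an image yields, which is what allows images with different numbers of detected patches to be compared directly in a retrieval setting. In practice the mixture would be fitted on patch features from a training set rather than drawn at random.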
