Channel Recurrent Attention Networks for Video Pedestrian Retrieval (2010.03108v1)

Published 7 Oct 2020 in cs.CV and cs.LG

Abstract: Full attention, which generates an attention value per element of the input feature maps, has been shown to benefit visual tasks. In this work, we propose a fully attentional network, termed the channel recurrent attention network, for the task of video pedestrian retrieval. The main attention unit, channel recurrent attention, identifies attention maps at the frame level by jointly leveraging spatial and channel patterns via a recurrent neural network. This channel recurrent attention is designed to build a global receptive field by recurrently receiving and learning the spatial vectors. Then, a set aggregation cell is employed to generate a compact video representation. Experimental results demonstrate the superior performance of the proposed deep network, which outperforms current state-of-the-art results across standard video person retrieval benchmarks, and a thorough ablation study shows the effectiveness of the proposed units.
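The pipeline the abstract describes (per-frame attention from a recurrent network that reads spatial vectors, then an aggregation step across frames) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no architectural details, so the plain tanh recurrence, the random weights, the choice to feed one flattened channel per recurrent step, the sigmoid gating, and the mean-based `set_aggregation` are all assumptions made purely for illustration, as are all function and variable names.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained model would learn these (assumption)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def channel_recurrent_attention(frame, w_in, w_rec, w_out):
    """frame: list of C channels, each a flattened spatial vector of length H*W.

    Assumed design: channels are fed to a plain tanh RNN one at a time, so the
    hidden state at step c has seen every spatial position of channels 0..c.
    Returns the attended frame, with one sigmoid attention value per element.
    """
    hidden = [0.0] * len(w_rec)
    attended = []
    for spatial_vec in frame:
        pre = [a + b for a, b in zip(matvec(w_in, spatial_vec), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        attn = [sigmoid(o) for o in matvec(w_out, hidden)]
        attended.append([a * x for a, x in zip(attn, spatial_vec)])
    return attended


def set_aggregation(frame_features):
    """Order-independent aggregation; assumed here to be an element-wise mean."""
    n = len(frame_features)
    return [sum(vals) / n for vals in zip(*frame_features)]


rng = random.Random(0)
C, HW, HID, T = 4, 6, 5, 3  # channels, H*W, hidden size, frames (toy sizes)
w_in = rand_matrix(HID, HW, rng)
w_rec = rand_matrix(HID, HID, rng)
w_out = rand_matrix(HW, HID, rng)

# A toy "video": T frames, each with C channels of H*W activations in [0, 1).
video = [[[rng.random() for _ in range(HW)] for _ in range(C)] for _ in range(T)]

frame_features = []
for frame in video:
    attended = channel_recurrent_attention(frame, w_in, w_rec, w_out)
    # Spatial average pooling per channel gives a C-dimensional frame feature.
    frame_features.append([sum(ch) / HW for ch in attended])

video_repr = set_aggregation(frame_features)
print(len(video_repr))
```

Because each recurrent step consumes a whole flattened channel, every attention value depends on all spatial positions seen so far, which is one simple way to obtain the global receptive field the abstract mentions. The mean in `set_aggregation` is chosen only because it treats the frames as an unordered set.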

Citations (1)


