Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
97 tokens/sec
GPT-4o
53 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Tensor Composition Net for Visual Relationship Prediction (2012.05473v2)

Published 10 Dec 2020 in cs.CV

Abstract: We present a novel Tensor Composition Net (TCN) to predict visual relationships in images. Visual Relationship Prediction (VRP) provides a more challenging test of image understanding than conventional image tagging and is difficult to learn due to a large label-space and incomplete annotation. The key idea of our TCN is to exploit the low-rank property of the visual relationship tensor, so as to leverage correlations within and across objects and relations and make a structured prediction of all visual relationships in an image. To show the effectiveness of our model, we first empirically compare our model with Multi-Label Image Classification (MLIC) methods, eXtreme Multi-label Classification (XMC) methods, and VRD methods. We then show that thanks to our tensor (de)composition layer, our model can predict visual relationships which have not been seen in the training dataset. We finally show our TCN's image-level visual relationship prediction provides a simple and efficient mechanism for relation-based image-retrieval even compared with VRD methods.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (5)
  1. Yuting Qiang (4 papers)
  2. Yongxin Yang (73 papers)
  3. Xueting Zhang (3 papers)
  4. Yanwen Guo (41 papers)
  5. Timothy M. Hospedales (69 papers)
Citations (1)

Summary

We haven't generated a summary for this paper yet.