Supplementing Missing Visions via Dialog for Scene Graph Generations (2204.11143v2)

Published 23 Apr 2022 in cs.CV

Abstract: Most current AI systems rely on the premise that the input visual data are sufficient to achieve competitive performance in various computer vision tasks. However, the classic task setup rarely considers the challenging yet common practical situations where the complete visual data may be inaccessible due to various reasons (e.g., restricted view range and occlusions). To this end, we investigate a computer vision task setting with incomplete visual input data. Specifically, we exploit the Scene Graph Generation (SGG) task with various levels of visual data missingness as input. While insufficient visual input intuitively leads to a performance drop, we propose to supplement the missing visions via natural language dialog interactions to better accomplish the task objective. We design a model-agnostic Supplementary Interactive Dialog (SI-Dial) framework that can be jointly learned with most existing models, endowing current AI systems with the ability of question-answer interactions in natural language. We demonstrate the feasibility of such a task setting with missing visual input and the effectiveness of our proposed dialog module as the supplementary information source through extensive experiments and analysis, achieving promising performance improvements over multiple baselines.
