Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 164 tok/s
Gemini 2.5 Pro 46 tok/s Pro
GPT-5 Medium 21 tok/s Pro
GPT-5 High 27 tok/s Pro
GPT-4o 72 tok/s Pro
Kimi K2 204 tok/s Pro
GPT OSS 120B 450 tok/s Pro
Claude Sonnet 4.5 34 tok/s Pro
2000 character limit reached

Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection (2312.01713v1)

Published 4 Dec 2023 in cs.CV

Abstract: Human-Object Interaction (HOI) detection is a core task for human-centric image understanding. Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction; however, the interaction representations obtained using this method are entangled and lack interpretability. In contrast, traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner. In this paper, we improve the performance of one-stage methods by enabling them to extract disentangled interaction representations. First, we propose Shunted Cross-Attention (SCA) to extract human appearance, object appearance, and global context features using different cross-attention heads. This is achieved by imposing different masks on the cross-attention maps produced by the different heads. Second, we introduce the Interaction-aware Pose Estimation (IPE) task to learn interaction-relevant human pose features using a disentangled decoder. This is achieved with a novel attention module that accurately captures the human keypoints relevant to the current interaction category. Finally, our approach fuses the appearance feature and pose feature via element-wise addition to form the interaction representation. Experimental results show that our approach can be readily applied to existing one-stage HOI detectors. Moreover, we achieve state-of-the-art performance on two benchmarks: HICO-DET and V-COCO.

Summary

We haven't generated a summary for this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube