FreeA: Human-object Interaction Detection using Free Annotation Labels (2403.01840v2)
Abstract: Recent human-object interaction (HOI) detection methods depend on extensively annotated image datasets, which require significant manpower. In this paper, we propose a novel self-adaptive, language-driven HOI detection method, termed FreeA. This method leverages the adaptability of a text-image model to generate latent HOI labels without manual annotation. Specifically, FreeA aligns image features of human-object pairs with HOI text templates and employs a knowledge-based masking technique to suppress improbable interactions. Furthermore, FreeA introduces an interaction correlation matching method that boosts the likelihood of actions related to a specified action, thereby refining the generated HOI labels. Experiments on two benchmark datasets show that FreeA achieves state-of-the-art performance among weakly supervised HOI competitors. Our proposal gains +\textbf{13.29} (\textbf{159\%$\uparrow$}) mAP and +\textbf{17.30} (\textbf{98\%$\uparrow$}) mAP over the newest "Weakly" supervised model, and +\textbf{7.19} (\textbf{28\%$\uparrow$}) mAP and +\textbf{14.69} (\textbf{34\%$\uparrow$}) mAP over the latest "Weakly+" supervised model, on the HICO-DET and V-COCO datasets respectively, and is more accurate in localizing and classifying interactive actions. The source code will be made public.
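The label-generation idea described above (matching human-object pair features against HOI text templates, then masking out improbable interactions) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the embeddings stand in for a CLIP-like text-image model, and the mask values are hypothetical.

```python
# Hedged sketch of the abstract's pipeline: score a human-object pair's image
# embedding against HOI text-template embeddings (CLIP-style cosine
# similarity), then apply a knowledge-based mask that zeroes out actions
# implausible for the detected object, yielding a pseudo-label distribution.
# Embeddings below are random stand-ins, NOT features from a real model.
import numpy as np

def hoi_pseudo_labels(img_emb, text_embs, action_mask):
    """Return a probability over HOI templates for one human-object pair.

    img_emb     : (d,)   image embedding of the cropped human-object pair
    text_embs   : (k, d) embeddings of k HOI text templates
    action_mask : (k,)   1 if the action is plausible for the object, else 0
    """
    img = img_emb / np.linalg.norm(img_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = txt @ img                                  # one score per template
    sims = np.where(action_mask > 0, sims, -np.inf)   # knowledge-based masking
    exp = np.exp(sims - sims[action_mask > 0].max())  # stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)
img_emb = rng.normal(size=512)
text_embs = rng.normal(size=(5, 512))   # e.g. 5 candidate HOI templates
mask = np.array([1, 1, 0, 1, 0])        # "ride bicycle" plausible, "eat bicycle" not
probs = hoi_pseudo_labels(img_emb, text_embs, mask)
print(probs)
```

Masked actions receive exactly zero probability, so the resulting distribution only assigns pseudo-label mass to interactions the knowledge prior deems plausible.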