Triple Attention Network architecture for MovieQA (2111.09531v1)

Published 18 Nov 2021 in cs.MM

Abstract: Movie question answering (MovieQA) is a multimedia task in which one is given a video, its subtitles, a question, and candidate answers. The goal is to predict the correct answer using the components of the multimedia, namely video/images, audio, and text. Traditionally, MovieQA uses only the image and text components. In this paper, we propose a novel network with a triple-attention architecture that incorporates audio into the MovieQA task. The architecture is modeled on a traditional dual-attention network that attends only to video and text. Experiments show that including audio through the triple-attention network provides complementary information for MovieQA that is not captured by the visual or textual components of the data. Experiments with a wide range of audio features show that such a network can improve MovieQA performance by about 7% relative to using only visual features.
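
The abstract does not give the network's equations, so the following is only a rough sketch of what question-guided attention over three modalities might look like, not the authors' architecture. It assumes every modality has already been encoded as fixed-size vectors, uses plain dot-product attention, and fuses the attended summaries by summation; the function names (`attend`, `triple_attention_scores`), the fusion rule, the dimensions, and the random inputs are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(query, features):
    """Dot-product attention: weighted average of feature vectors,
    weighted by their similarity to the query."""
    weights = softmax([dot(f, query) for f in features])
    dim = len(query)
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


def triple_attention_scores(question, video, subtitles, audio, answers):
    """Return a probability for each candidate answer (hypothetical sketch)."""
    # One attention pass per modality, each guided by the question vector.
    v = attend(question, video)
    t = attend(question, subtitles)
    a = attend(question, audio)
    # Assumed fusion rule: elementwise sum of the question and the summaries.
    fused = [q + x + y + z for q, x, y, z in zip(question, v, t, a)]
    # Score each candidate answer by similarity to the fused representation.
    return softmax([dot(ans, fused) for ans in answers])


def random_vectors(rng, count, dim):
    """Stand-in for real feature extractors: random Gaussian vectors."""
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
DIM = 8
question = random_vectors(rng, 1, DIM)[0]
video = random_vectors(rng, 5, DIM)       # e.g. 5 frame embeddings
subtitles = random_vectors(rng, 7, DIM)   # e.g. 7 subtitle sentence embeddings
audio = random_vectors(rng, 4, DIM)       # e.g. 4 audio segment embeddings
answers = random_vectors(rng, 5, DIM)     # candidate answer embeddings

scores = triple_attention_scores(question, video, subtitles, audio, answers)
predicted = max(range(len(scores)), key=scores.__getitem__)
print(predicted, [round(s, 3) for s in scores])
```

Dropping the `audio` term from `fused` would give the dual-attention (video and text only) variant under the same assumptions, which is the kind of baseline the abstract compares against.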
