Contextual Emotion Recognition using Large Vision Language Models (2405.08992v1)
Abstract: "How does the person in the bounding box feel?" Achieving human-level recognition of the apparent emotion of a person in real world situations remains an unsolved task in computer vision. Facial expressions are not enough: body pose, contextual knowledge, and commonsense reasoning all contribute to how humans perform this emotional theory of mind task. In this paper, we examine two major approaches enabled by recent large vision language models: 1) image captioning followed by a language-only LLM, and 2) vision language models, under zero-shot and fine-tuned setups. We evaluate the methods on the Emotions in Context (EMOTIC) dataset and demonstrate that a vision language model, fine-tuned even on a small dataset, can significantly outperform traditional baselines. The results of this work aim to help robots and agents perform emotionally sensitive decision-making and interaction in the future.
- Yasaman Etesam
- Chuxuan Zhang
- Angelica Lim
- Özge Nilay Yalçın