Emergent Mind

Boosting Continuous Emotion Recognition with Self-Pretraining using Masked Autoencoders, Temporal Convolutional Networks, and Transformers

(2403.11440)

Published Mar 18, 2024 in cs.CV

Abstract

Human emotion recognition holds a pivotal role in facilitating seamless human-computer interaction. This paper delineates our methodology in tackling the Valence-Arousal (VA) Estimation Challenge, Expression (Expr) Classification Challenge, and Action Unit (AU) Detection Challenge within the ambit of the 6th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW). Our study advocates a novel approach aimed at refining continuous emotion recognition. We achieve this by initially harnessing pre-training with Masked Autoencoders (MAE) on facial datasets, followed by fine-tuning on the aff-wild2 dataset annotated with expression (Expr) labels. The pre-trained model serves as an adept visual feature extractor, thereby enhancing the model's robustness. Furthermore, we bolster the performance of continuous emotion recognition by integrating Temporal Convolutional Network (TCN) modules and Transformer Encoder modules into our framework.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.