Context-Aware Siamese Networks for Efficient Emotion Recognition in Conversation (2404.11141v1)
Abstract: Deep learning models have contributed considerably to progress in Emotion Recognition in Conversation (ERC). However, the task remains challenging because human emotions are plural and subjective. Previous work on ERC mostly provides predictive models built on graph-based conversation representations. In this work, we propose a way to model the conversational context and incorporate it into a metric learning training strategy with a two-step process. This allows us to perform ERC in a flexible classification scenario and to obtain a lightweight yet efficient model. Using metric learning through a Siamese Network architecture, we achieve a macro F1 score of 57.71 for emotion classification in conversation on the DailyDialog dataset, which outperforms the related work. This state-of-the-art result is promising for the use of metric learning in emotion recognition, although the micro F1 score obtained leaves room for improvement.
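The abstract reports both macro and micro F1. As a minimal sketch of how the two averages can diverge, the snippet below computes both on a small set of invented labels; the function name `f1_scores` and the toy data are assumptions made for illustration and are not taken from the paper or from DailyDialog.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for two equal-length label sequences."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 scores, so every class weighs the same.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes before computing F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical, imbalanced toy labels (not real data).
y_true = ["neutral"] * 8 + ["joy", "anger"]
y_pred = ["neutral"] * 8 + ["neutral", "anger"]
macro, micro = f1_scores(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```

In this toy case a single missed minority-class example lowers macro F1 much more than micro F1, which is why the two figures are usually reported side by side on imbalanced emotion data.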