State of the Art in the Application of Multimodal Affective Methods for Comparative Analysis of Modal Deficits
Research Article
Open Access
CC BY


Yanhaotian Zhao 1*
1 School of Information Science & Engineering, Lanzhou University, Lanzhou, Gansu, 730000, China
*Corresponding author: zhaoyht21@lzu.edu.cn
Published on 4 July 2025
ACE Vol.174
ISSN (Print): 2755-2721
ISSN (Online): 2755-273X
ISBN (Print): 978-1-80590-235-5
ISBN (Online): 978-1-80590-236-2

Abstract

The advent of multimedia technology has precipitated a paradigm shift in human-computer interaction and affective computing, making multimodal emotion recognition a pivotal research domain. In practical applications, however, missing modalities caused by equipment failure or environmental interference significantly degrade recognition accuracy. This paper analyses multimodal emotion recognition methods designed for missing-modality conditions, focusing on a comparison of the strengths and weaknesses of generative and joint-representation approaches. Experimental findings demonstrate that these methods surpass conventional baselines on diverse datasets, including IEMOCAP and CMU-MOSI. Notably, CIF-MMIN improves mean accuracy by 0.92% under missing-modality conditions while reducing parameter count by 30% relative to UniMF, thus preserving state-of-the-art performance. Key challenges in this field include cross-modal dependencies and semantic consistency, model generalisation ability, and adaptation to dynamic scenes. These challenges may be addressed in the future through lightweight solutions that do not require full-modality pre-training, and by combining contrastive learning with generative modelling to enhance semantic fidelity. This paper provides both theoretical support and practical guidance for the development of robust and efficient emotion recognition systems.
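The core idea behind the generative (imagination/reconstruction) approaches surveyed here can be illustrated with a minimal sketch: when one modality is missing at test time, its features are reconstructed from an available modality using a mapping learned on modality-complete training data, and the imagined features are then fused into a joint representation. The linear least-squares mapping below is a toy stand-in for the learned imagination modules of methods such as MMIN, not an implementation of any specific paper's model; all variable names and dimensions are illustrative assumptions.

```python
import numpy as np

# Toy data: paired audio/text features where text is (noisily)
# predictable from audio, standing in for cross-modal correlation.
rng = np.random.default_rng(0)
n, d_audio, d_text = 200, 8, 6
W_true = rng.normal(size=(d_audio, d_text))
audio = rng.normal(size=(n, d_audio))
text = audio @ W_true + 0.1 * rng.normal(size=(n, d_text))

# "Training": fit a cross-modal map on modality-complete pairs.
W_hat, *_ = np.linalg.lstsq(audio, text, rcond=None)

# "Test": the text modality is missing, so imagine it from audio.
audio_test = rng.normal(size=(5, d_audio))
text_imagined = audio_test @ W_hat

# Fuse available and imagined features into a joint representation
# that a downstream emotion classifier could consume.
joint = np.concatenate([audio_test, text_imagined], axis=1)
print(joint.shape)  # (5, 14)
```

Real systems replace the linear map with deep encoder-decoder or diffusion modules and add losses enforcing semantic consistency between real and imagined features, but the fill-then-fuse structure is the same.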

Keywords:

Multimodal emotion recognition, missing modalities, robustness, cross-modal imagination


References

[1]. Zhao, Jinming, Ruichen Li, and Qin Jin. "Missing modality imagination network for emotion recognition with uncertain missing modalities." Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.

[2]. Wang, Yuanzhi, Yong Li, and Zhen Cui. "Incomplete multimodality-diffused emotion recognition." Advances in Neural Information Processing Systems 36 (2023): 17117-17128.

[3]. Huan, Ruohong, et al. "UniMF: A unified multimodal framework for multimodal sentiment analysis in missing modalities and unaligned multimodal sequences." IEEE Transactions on Multimedia 26 (2023): 5753-5768.

[4]. Liu, Rui, et al. "Contrastive learning based modality-invariant feature acquisition for robust multimodal emotion recognition with missing modalities." IEEE Transactions on Affective Computing 15.4 (2024): 1856-1873.

[5]. Zeng, Jiandian, Tianyi Liu, and Jiantao Zhou. "Tag-assisted multimodal sentiment analysis under uncertain missing modalities." Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2022.

[6]. Zeng, Jiandian, Jiantao Zhou, and Tianyi Liu. "Robust multimodal sentiment analysis via tag encoding of uncertain missing modalities." IEEE Transactions on Multimedia 25 (2022): 6301-6314.

[7]. Fu, Fangze, et al. "SDR-GNN: Spectral Domain Reconstruction Graph Neural Network for incomplete multimodal learning in conversational emotion recognition." Knowledge-Based Systems 309 (2025): 112825.

[8]. Shi, Piao, et al. "Text-guided Reconstruction Network for Sentiment Analysis with Uncertain Missing Modalities." IEEE Transactions on Affective Computing (2025).

[9]. Zhu, Linan, et al. "Multimodal sentiment analysis with unimodal label generation and modality decomposition." Information Fusion 116 (2025): 102787.

[10]. John, Vijay, and Yasutomo Kawanishi. "Multimodal Cascaded Framework with Multimodal Latent Loss Functions Robust to Missing Modalities." ACM Transactions on Multimedia Computing, Communications and Applications (2025).

Cite this article

Zhao, Y. (2025). State of the Art in the Application of Multimodal Affective Methods for Comparative Analysis of Modal Deficits. Applied and Computational Engineering, 174, 1-9.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

About volume

Volume title: Proceedings of CONF-CDS 2025 Symposium: Data Visualization Methods for Evaluation

ISBN: 978-1-80590-235-5(Print) / 978-1-80590-236-2(Online)
Editors: Marwan Omar, Elisavet Andrikopoulou
Conference date: 30 July 2025
Series: Applied and Computational Engineering
Volume number: Vol.174
ISSN: 2755-2721(Print) / 2755-273X(Online)