Analysis of the Current Status of Research on Multimodal Human Behavior Recognition Based on Perceptual Modalities
Research Article
Open Access
CC BY

Chakto Lai 1*
1 Aquinas International School, Ontario, CA, 91761, USA
*Corresponding author: charlielai2006@yahoo.com
Published on 10 July 2025
ACE Vol.174
ISSN (Print): 2755-2721
ISSN (Online): 2755-273X
ISBN (Print): 978-1-80590-235-5
ISBN (Online): 978-1-80590-236-2

Abstract

In complex real-world environments, traditional human behavior recognition methods are sensitive to lighting changes, occlusion, and background interference. The resulting perception information is incomplete and the methods lack robustness, which makes it difficult to meet the stable-recognition requirements of scenarios such as intelligent security and medical monitoring. Multimodal perception addresses this by fusing multi-source modalities such as RGB, depth, and skeleton data, which enables multidimensional modeling of behavior information with complementary features, and it has attracted significant research attention in recent years. Taking the perception modality as the core of its analysis, this paper systematically reviews the mainstream perception data types used in multimodal behavior recognition, analyzes the advantages and limitations of the RGB, depth, and skeleton modalities, and summarizes their complementary mechanisms and the typical methods used when they are combined in fusion. From the perspective of perception modality, the study provides theoretical support for modality selection and fusion strategy design in behavior recognition systems, which is of value for both research and application.
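As a rough illustration of what feature complementarity can mean in practice, the sketch below shows one generic way to combine modalities: score-level (late) fusion, where each modality produces its own class scores and the scores are averaged. The abstract does not specify a fusion method, so this is an assumption made purely for illustration. The function names, the modality keys, and the example scores are all hypothetical. A modality that is unavailable, for example RGB under poor lighting, is simply left out of the average.

```python
from math import exp


def softmax(scores):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(scores)
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(modality_scores, weights=None):
    """Weighted average of per-modality class probabilities.

    modality_scores: dict mapping a modality name to a list of raw class
                     scores, or to None when that modality is unavailable.
    weights:         optional dict of non-negative per-modality weights
                     (hypothetical values; equal weights by default).
    """
    available = {m: s for m, s in modality_scores.items() if s is not None}
    if not available:
        raise ValueError("no modality available")

    n_classes = len(next(iter(available.values())))
    if any(len(s) != n_classes for s in available.values()):
        raise ValueError("all modalities must score the same classes")

    if weights is None:
        weights = {m: 1.0 for m in available}
    total_weight = sum(weights[m] for m in available)
    if total_weight <= 0:
        raise ValueError("weights of available modalities must sum to > 0")

    fused = [0.0] * n_classes
    for name, scores in available.items():
        share = weights[name] / total_weight  # renormalized over available modalities
        for i, p in enumerate(softmax(scores)):
            fused[i] += share * p
    return fused


def predict(modality_scores, weights=None):
    """Return the index of the class with the highest fused probability."""
    fused = late_fusion(modality_scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical example: RGB is missing, depth and skeleton still vote.
example = {"rgb": None, "depth": [0.2, 1.5, 0.1], "skeleton": [0.1, 2.0, 0.3]}
predicted_class = predict(example)
```

Late fusion is only one option. Fusing intermediate features rather than final scores is another common design, and which one suits a given system depends on the sensors and data available.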

Keywords:

Multimodal perception, human behavior recognition, perception modality, RGB/Depth/Skeleton

Lai, C. (2025). Analysis of the Current Status of Research on Multimodal Human Behavior Recognition Based on Perceptual Modalities. Applied and Computational Engineering, 174, 69-76.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

About volume

Volume title: Proceedings of CONF-CDS 2025 Symposium: Data Visualization Methods for Evaluatio

ISBN: 978-1-80590-235-5 (Print) / 978-1-80590-236-2 (Online)
Editor: Marwan Omar, Elisavet Andrikopoulou
Conference date: 30 July 2025
Series: Applied and Computational Engineering
Volume number: Vol.174
ISSN: 2755-2721 (Print) / 2755-273X (Online)