Multi-Modal Deep Learning for Human Activity Recognition: Fusing Skin Texture, Pose Estimation, and Motion Dynamics
Author(s): Dr. D. Seethalakshmi, Dr. R. Arunadevi
Authors Affiliations:
1 Assistant Professor, Anna Adarsh College for Women, Chennai, India
2 Principal, Vidhya Sagar Women’s College, Chengalpet, India
DOI: 10.2015/IJIRMF/202507027     |     Paper ID: IJIRMF202507027

Abstract: With applications in sports analytics, security, healthcare, and human-computer interaction, Human Activity Recognition (HAR) is growing in popularity in computer vision. Traditional HAR methods often rely solely on motion analysis or pose estimation, which limits their robustness in complex scenarios. This research proposes a multi-modal deep learning system that integrates pose estimation, motion dynamics, and skin texture analysis to enhance HAR accuracy. By utilizing transformer-based architectures and convolutional neural networks (CNNs), the proposed model effectively captures the temporal and spatial dependencies in human motion. Additionally, it incorporates skin-related data for enhanced subject tracking and segmentation. A novel fusion process is presented to merge the derived features from the several modalities and to ensure resilience against occlusions, changing lighting conditions, and a range of skin tones. The proposed approach outperforms state-of-the-art methods on benchmark HAR datasets. The results reveal that incorporating skin-aware features significantly improves activity recognition in real-world scenarios, highlighting the potential of multi-modal deep learning for HAR applications.
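The abstract describes fusing features from three branches (pose estimation, motion dynamics, skin texture) before classification. The following is a minimal illustrative sketch of such late fusion by concatenation, not the authors' implementation: the feature dimensions, number of activity classes, and the random stand-in encoders are all assumptions made for illustration.

```python
# Hypothetical sketch of late fusion for multi-modal HAR.
# Three modality feature vectors are concatenated and scored by a
# linear classifier; dimensions and class count are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def extract_features(dim, rng):
    """Stand-in for a per-modality CNN/transformer encoder (assumed)."""
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)  # L2-normalise each modality before fusion

pose_feat   = extract_features(64, rng)   # pose-estimation branch
motion_feat = extract_features(64, rng)   # motion-dynamics branch
skin_feat   = extract_features(32, rng)   # skin-texture branch

# Fusion step: concatenate the per-modality embeddings into one vector.
fused = np.concatenate([pose_feat, motion_feat, skin_feat])  # shape (160,)

num_classes = 6  # e.g. walk, run, sit, stand, jump, wave (assumed labels)
W = rng.standard_normal((num_classes, fused.size)) * 0.01  # untrained weights
logits = W @ fused
probs = np.exp(logits - logits.max())
probs /= probs.sum()                       # softmax over activity classes
predicted = int(np.argmax(probs))
print(fused.shape, predicted)
```

In a trained system the random encoders and weights would be replaced by learned networks; the per-modality normalisation keeps one branch from dominating the concatenated representation, which is one common motivation for a dedicated fusion stage.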
Dr. D. Seethalakshmi, Dr. R. Arunadevi (2025); Multi-Modal Deep Learning for Human Activity Recognition: Fusing Skin Texture, Pose Estimation, and Motion Dynamics, International Journal for Innovative Research in Multidisciplinary Field, ISSN(O): 2455-0620, Vol-11, Issue-7, Pp. 188-194. Available at: https://www.ijirmf.com/