ICTACT Journals

HYBRID HAAR CASCADE AND CNN+EDL FRAMEWORK FOR ROBUST FACIAL EXPRESSION RECOGNITION IN HUMAN–COMPUTER INTERACTION

ICTACT Journal on Communication Technology (Volume: 16, Issue: 3)

Abstract

Facial Expression Recognition (FER) has emerged as a crucial component in Human–Computer Interaction (HCI), enabling applications in healthcare, education, surveillance, and social robotics. Despite considerable progress, achieving robust FER in unconstrained environments remains challenging due to variations in illumination, pose, occlusion, and intra-class similarity. Conventional approaches relying solely on handcrafted features or deep learning often suffer from redundancy in extracted features, sensitivity to noise, and suboptimal performance on subtle emotions such as fear and disgust. These limitations hinder their deployment in real-world, dynamic HCI scenarios where reliability and generalization are essential. This work proposes a hybrid FER framework that integrates Haar Cascade-based feature localization with a Convolutional Neural Network augmented by Evidential Deep Learning (CNN+EDL). Preprocessing stages include image resizing, grayscale conversion, histogram equalization, Gaussian smoothing, face alignment, and normalization. Haar Cascade is employed to extract the primary Regions of Interest (eyes, nose, mouth), reducing computational overhead and focusing learning on salient features. These features are then classified using CNN+EDL, which leverages uncertainty modeling and adaptive optimization to improve classification robustness. Experimental evaluations conducted on the FER2013 dataset demonstrate that the proposed model consistently outperforms conventional CNN, ResNet-34, MobileNet V1, EJH-CNN-BiLSTM, and DCNN-Autoencoder baselines. At 100 epochs, CNN+EDL achieves the highest accuracy (97.1%), precision (95.6%), recall (94.5%), and F1-score (94.9%), surpassing the closest baseline by 3–5%. Emotion-wise performance is also superior, with accuracy values of 96.2% (Happy), 94.1% (Sad), 91.3% (Disgust), 90.2% (Fear), 93.5% (Angry), 95.6% (Surprise), and 94.4% (Neutral). These results highlight the system's generalization ability, particularly for complex emotions.
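
The abstract does not spell out how the EDL head models uncertainty. As a rough illustration only, the sketch below assumes the commonly used Dirichlet-based formulation of evidential deep learning, in which the network outputs non-negative evidence per class and the Dirichlet parameters are the evidence plus one. The function name `edl_opinion` and the example evidence values are hypothetical and are not taken from the paper; only the seven emotion labels come from the abstract.

```python
from typing import List, Tuple

# Emotion classes as listed in the abstract.
EMOTIONS = ["Happy", "Sad", "Disgust", "Fear", "Angry", "Surprise", "Neutral"]


def edl_opinion(evidence: List[float]) -> Tuple[List[float], List[float], float]:
    """Turn per-class evidence into (expected probabilities, beliefs, uncertainty).

    Assumes a Dirichlet-based EDL head (an assumption, not confirmed by the source):
      alpha_k = e_k + 1,  S = sum(alpha),
      p_k = alpha_k / S,  b_k = e_k / S,  u = K / S.
    """
    if not evidence:
        raise ValueError("evidence must contain at least one class")
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")

    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]          # Dirichlet parameters
    strength = sum(alpha)                        # Dirichlet strength S
    probs = [a / strength for a in alpha]        # expected class probabilities
    beliefs = [e / strength for e in evidence]   # belief mass per class
    uncertainty = num_classes / strength         # leftover (vacuous) mass
    return probs, beliefs, uncertainty


if __name__ == "__main__":
    # Hypothetical evidence vectors: one confident, one with no evidence at all.
    for name, ev in [("confident", [70.0, 0, 0, 0, 0, 0, 0]), ("no evidence", [0.0] * 7)]:
        probs, beliefs, u = edl_opinion(ev)
        top = EMOTIONS[probs.index(max(probs))]
        print(f"{name}: top={top}, p={max(probs):.3f}, uncertainty={u:.3f}")
```

Under this assumed formulation, the belief masses and the uncertainty always sum to one, so an input with little supporting evidence (for example, an occluded face) yields a high uncertainty value rather than a confidently wrong label.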

Authors

R. Shanthakumari¹, M. Babu², S. Sharavanan³, R. Nithiavathy⁴
Kongu Engineering College, India¹, Karpagam College of Engineering, India²,⁴, CMS College of Engineering, India³

Keywords

Facial Expression Recognition, Haar Cascade, Convolutional Neural Network, Evidential Deep Learning, Human–Computer Interaction

Published By

ICTACT

Published In

ICTACT Journal on Communication Technology
(Volume: 16, Issue: 3)

Date of Publication

September 2025

Pages

3652 - 3663

DOI

10.21917/ijct.2025.0543

