CLASSIFICATION OF HUMAN EMOTIONS USING THERMOGRAMS AND NEURAL NETWORK

Evan Yershov; Madiyar Nurgaliyev; Gulbakhar Dosymbetova; Batyrbek Zholamanov; Sayat Orynbassar; Tomiris Khumarbekkyzy

doi:10.37943/22GEBT9085

Authors

Evan Yershov Al-Farabi Kazakh National University, Kazakhstan https://orcid.org/0009-0006-2267-0365
Madiyar Nurgaliyev Al-Farabi Kazakh National University, Kazakhstan https://orcid.org/0000-0002-6795-5384
Gulbakhar Dosymbetova Al-Farabi Kazakh National University, Kazakhstan https://orcid.org/0000-0002-3935-7213
Batyrbek Zholamanov Al-Farabi Kazakh National University, Kazakhstan https://orcid.org/0000-0001-8206-7425
Sayat Orynbassar Al-Farabi Kazakh National University, Kazakhstan https://orcid.org/0009-0001-9124-2560
Tomiris Khumarbekkyzy Al-Farabi Kazakh National University, Kazakhstan https://orcid.org/0009-0005-4945-6273

DOI:

https://doi.org/10.37943/22GEBT9085

Keywords:

neural network, convolutional neural network, thermal imager, emotion recognition, Inception, U-Net, Quadruplet Network, SqueezeNet

Abstract

As information systems and technologies continue to evolve, there remains a noticeable gap in the efficiency and practicality of data processing algorithms, especially in the field of emotion recognition. This study explores several neural network models designed to classify emotions from thermal images (thermograms). The training dataset comprised 1,642 images, some of which were generated through augmentation; all images were captured while participants viewed emotionally charged videos. The goal was to recognize six basic emotions: joy, sadness, fear, disgust, anger, and surprise. To identify the most effective architecture, the performance of five models was compared: a standard convolutional neural network (CNN), Quadruplet Network, U-Net, Inception, and SqueezeNet. Each model was trained on the same dataset under consistent conditions. Classification accuracy and validation loss were the main evaluation metrics. In addition, data augmentation and early stopping were applied to improve generalization and prevent overfitting. Among the tested architectures, the Inception model achieved the highest test accuracy of 97.5%, while the Quadruplet Network achieved 96.85% accuracy with a lower validation loss of 0.571, indicating stronger generalization. These results suggest that both models are well suited for real-time emotion recognition using thermal imaging. The findings highlight the potential of combining infrared data with modern neural architectures to advance emotion detection systems beyond traditional RGB-based methods.
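
The early stopping mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the patience, the improvement threshold, or the framework used, so the function name `early_stop`, its parameters, and the loss values below are assumptions chosen only for illustration, not the authors' actual settings.

```python
def early_stop(val_losses, patience=3, min_delta=0.0):
    """Return (best_epoch, stop_epoch) for a list of per-epoch validation losses.

    Hypothetical helper: training is assumed to halt once the validation
    loss has failed to improve by more than `min_delta` for `patience`
    consecutive epochs. Epochs are 0-indexed.
    """
    best_loss = float("inf")
    best_epoch = -1
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and reset the counter.
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                # Stop here; weights from best_epoch would be restored.
                return best_epoch, epoch
    # Ran through all epochs without triggering the stop condition.
    return best_epoch, len(val_losses) - 1


# Illustrative (made-up) validation losses, one per epoch.
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5]
best, stopped = early_stop(losses, patience=3)
print(f"best epoch: {best}, stopped at epoch: {stopped}")
```

With a patience of 3, the run above stops before reaching the final, lower loss. This shows the usual trade-off: a small patience saves training time but can end training before a later improvement.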

Author Biographies

Evan Yershov, Al-Farabi Kazakh National University, Kazakhstan

Bachelor's student, Faculty of Physics and Technology

Madiyar Nurgaliyev, Al-Farabi Kazakh National University, Kazakhstan

PhD, Faculty of Physics and Technology

Gulbakhar Dosymbetova, Al-Farabi Kazakh National University, Kazakhstan

PhD, Faculty of Physics and Technology

Batyrbek Zholamanov, Al-Farabi Kazakh National University, Kazakhstan

PhD student, Faculty of Physics and Technology

Sayat Orynbassar, Al-Farabi Kazakh National University, Kazakhstan

PhD student, Faculty of Physics and Technology

Tomiris Khumarbekkyzy, Al-Farabi Kazakh National University, Kazakhstan

Master's student, Faculty of Physics and Technology

