GESTURE RECOGNITION OF MACHINE LEARNING AND CONVOLUTIONAL NEURAL NETWORK METHODS FOR KAZAKH SIGN LANGUAGE
Keywords: Hand gesture recognition, neural networks, CNN, LSTM, SVM
Interest in machine learning and neural networks has grown in recent years, driven largely by technological advances that have improved computer recognition of objects, sounds, text, and other data types. As a result, human-computer interaction is becoming more natural and accessible to the average person. Progress in computer vision has enabled increasingly sophisticated models for object recognition in images and video, and these models can also be applied to hand gesture recognition. This research examines three popular approaches to hand gesture recognition: the Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and the Support Vector Machine (SVM). These models differ in their approach, processing time, and required training data size. Experiments on recognizing the dactyl (fingerspelling) alphabet of Kazakh Sign Language produced different results for each model. The article describes each method in detail, including its purpose and its effectiveness in terms of performance and training. The experimental results are recorded in a table showing the recognition accuracy for each gesture. In addition, selected hand gestures were tested live in front of a camera, with the recognized gesture displayed on screen. The working principle of the machine learning algorithms is explained using mathematical formulas and functions, together with the logical scheme and structure of the LSTM algorithm.
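As a rough illustration of the LSTM structure mentioned in the abstract, the sketch below implements one time step of the standard LSTM cell (forget, input, and output gates plus a candidate cell state) in plain Python. It is not the authors' implementation. The function names, the random weight initialization, and the idea of feeding one small feature vector per video frame (for example, hand landmark coordinates) are assumptions made only for this example.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, U, b, x, h):
    """Return W*x + U*h + b as a list, one entry per hidden unit."""
    return [
        sum(w * xi for w, xi in zip(W[k], x))
        + sum(u * hi for u, hi in zip(U[k], h))
        + b[k]
        for k in range(len(b))
    ]

def lstm_step(x, h_prev, c_prev, params):
    """One standard LSTM time step; params maps gate name -> (W, U, b)."""
    f = [sigmoid(v) for v in affine(*params["f"], x, h_prev)]    # forget gate
    i = [sigmoid(v) for v in affine(*params["i"], x, h_prev)]    # input gate
    o = [sigmoid(v) for v in affine(*params["o"], x, h_prev)]    # output gate
    g = [math.tanh(v) for v in affine(*params["g"], x, h_prev)]  # candidate cell state
    # New cell state: keep part of the old memory, add part of the candidate.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # New hidden state: gated, squashed view of the cell state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

def random_params(n_in, n_hidden, seed=0):
    """Hypothetical small random weights; a trained model would learn these."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        gate: (mat(n_hidden, n_in), mat(n_hidden, n_hidden), [0.0] * n_hidden)
        for gate in ("f", "i", "o", "g")
    }

def run_sequence(frames, n_hidden=4, seed=0):
    """Feed a sequence of per-frame feature vectors through the cell."""
    params = random_params(len(frames[0]), n_hidden, seed)
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    for x in frames:
        h, c = lstm_step(x, h, c, params)
    return h  # final hidden state, which a classifier layer could consume
```

Calling `run_sequence([[0.1, 0.2, 0.3], [0.0, 0.5, -0.2]])` returns a four-element hidden state for a two-frame sequence. In a gesture recognizer, that vector would typically be passed to a final classification layer with one output per dactyl letter.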
Copyright (c) 2023 Articles are open access under the Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.