INTERFACE DESIGN OF AN INTELLIGENT INTERACTIVE LEARNING SYSTEM
DOI:
https://doi.org/10.37943/24VEWA2615
Keywords:
multimodal interfaces, speech synthesis, avatar-based system, gestures, face mimics, Kazakh language
Abstract
This study presents the design, implementation, and evaluation of an Intelligent Interactive Learning System that employs multimodal interaction to improve the adaptability, accessibility, and engagement of digital education. Conventional e-learning platforms typically rely on static and text-based resources, which restrict personalization and reduce learner motivation. The proposed system integrates natural language processing, speech synthesis, and avatar-based interfaces to deliver lectures through synchronized speech, gestures, and facial expressions. The system automatically processes uploaded lecture scripts and slide presentations, segmenting and aligning them to generate interactive video lectures. A novel contribution of this work is the incorporation of customized Kazakh-language support, implemented through intonation modeling, dependency parsing, and gesture mapping to enhance inclusivity for underrepresented linguistic communities. The system performance was evaluated using Facebook’s variational inference text-to-speech model. Experimental results demonstrate real-time capability, with an average latency of 25.5 ms, throughput exceeding 4,200 characters per second, and low computational resource requirements. These findings confirm the suitability of the system for deployment in resource-constrained environments without compromising speech quality or responsiveness. Compared with conventional tutoring and static e-learning platforms, the system additionally provides automated assessment generation, multimodal feedback, and accessibility functions such as subtitles and adjustable playback controls. The study contributes a scalable model for intelligent, avatar-based learning that integrates speech synthesis, real-time interaction, and cultural-linguistic inclusivity. 
Future work will focus on extending personalization through adaptive learner modeling, incorporating affective computing for emotion-sensitive interaction, and enabling interoperability with established learning management systems.
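The latency and throughput figures reported above can be illustrated with a small timing harness. The following is a minimal sketch, not the authors' evaluation code: it assumes latency is measured per synthesis call and throughput as characters processed per second of synthesis time, which the abstract does not specify. The names `stub_synthesize` and `benchmark_tts` are hypothetical, and the stub stands in for a real text-to-speech call.

```python
import statistics
import time
from typing import Callable, Dict, List


def stub_synthesize(text: str) -> bytes:
    """Hypothetical stand-in for a TTS call: returns one zero byte per character."""
    return bytes(len(text))


def benchmark_tts(
    synthesize: Callable[[str], bytes], sentences: List[str]
) -> Dict[str, float]:
    """Time each synthesis call and summarize latency and throughput.

    Assumption: latency is wall-clock time per call, and throughput is
    total characters divided by total synthesis time.
    """
    latencies_ms: List[float] = []
    total_chars = 0
    total_seconds = 0.0
    for sentence in sentences:
        start = time.perf_counter()
        synthesize(sentence)
        elapsed = time.perf_counter() - start
        latencies_ms.append(elapsed * 1000.0)
        total_chars += len(sentence)
        total_seconds += elapsed
    throughput = total_chars / total_seconds if total_seconds > 0 else float("inf")
    return {
        "mean_latency_ms": statistics.mean(latencies_ms),
        "throughput_chars_per_s": throughput,
    }


if __name__ == "__main__":
    sample = ["Сәлем, әлем!", "Бүгін дәріс басталады."]
    print(benchmark_tts(stub_synthesize, sample))
```

Replacing `stub_synthesize` with a call into an actual speech model would give numbers comparable in kind to those in the abstract, although the absolute values depend on hardware, sentence length, and whether model loading is excluded from the timed region.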
License
Copyright (c) 2025 Articles are open access under the Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.