GLOVE-EMBEDDED ATTENTION BILSTM NETWORKS FOR ENHANCED MULTICLASSIFICATION OF TWEETS IN CYBERBULLYING DETECTION ON ONLINE CONTENT

Batyrkhan Omarov; Rustam Abdrakhmanov; Aigerim Toktarova

doi:10.37943/22PSRO3633

Authors

Batyrkhan Omarov, International Information Technology University, Kazakhstan, https://orcid.org/0000-0002-8341-7113
Rustam Abdrakhmanov, International University Of Tourism and Hospitality, Kazakhstan, https://orcid.org/0000-0002-5508-389X
Aigerim Toktarova, Khoja Akhmet Yassawi International Kazakh-Turkish University, Kazakhstan, https://orcid.org/0000-0002-6265-9236

Keywords:

cyberbullying detection, deep learning, natural language processing, GloVe embeddings, BiLSTM networks, self-attention mechanisms, social media

Abstract

This paper presents a neural network method for detecting and classifying cyberbullying on social media. The model combines GloVe embeddings, BiLSTM networks, and self-attention to recognize linguistic and semantic patterns: the GloVe embeddings supply semantic word representations, the BiLSTM layers learn sequential dependencies, and the self-attention mechanism focuses the contextual analysis. The research applies advanced machine learning methods to combat cyberbullying and suggests ways to improve the precision and ethics of cyberbullying detection systems. The proposed approach addresses several levels and forms of cyberbullying, enabling targeted interventions and victim support. Word clouds show the frequency and relevance of terms across the cyberbullying categories, revealing common themes and vocabulary. The analysis also covers tweet lengths, a confusion matrix, training and validation loss and accuracy, and ROC curves. The ROC curve analysis of the logistic regression model shows strong classification performance across the categories, with AUC values between 0.905 and 0.997. Age classification achieves the highest AUC (0.997), followed by religion (0.996), ethnicity (0.993), and gender (0.979); the cyberbullying and non-cyberbullying categories reach 0.921 and 0.905, respectively. The study encourages the use of AI technology for social good and emphasizes the need to improve classification algorithms so that they keep pace with the complex, changing language of cyberbullying. Future work should expand training datasets, explore hybrid modeling approaches, and develop ethical guidelines for AI applications.
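
The per-category AUC values quoted in the abstract are the kind of figure a one-vs-rest ROC analysis produces. The sketch below is not the authors' evaluation code; it is a minimal illustration that computes AUC for each class as the probability that a randomly chosen positive example is scored above a randomly chosen negative one, which equals the area under the ROC curve. The function names, the three class labels, and the toy labels and probabilities are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise ranking: the fraction of (positive, negative)
    pairs in which the positive scores higher; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(
    y_true: Sequence[str],
    probs: Sequence[Sequence[float]],
    classes: Sequence[str],
) -> Dict[str, float]:
    """Treat each class in turn as 'positive' and all others as 'negative'."""
    result: Dict[str, float] = {}
    for k, name in enumerate(classes):
        labels = [1 if y == name else 0 for y in y_true]
        scores = [row[k] for row in probs]
        result[name] = binary_auc(labels, scores)
    return result


# Hypothetical toy data (NOT taken from the paper): three classes, six tweets.
CLASSES: List[str] = ["age", "gender", "not_cyberbullying"]
Y_TRUE: List[str] = [
    "age", "age", "gender", "gender",
    "not_cyberbullying", "not_cyberbullying",
]
# Each row holds the predicted class probabilities for one tweet.
PROBS: List[List[float]] = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
    [0.1, 0.2, 0.7],
    [0.2, 0.5, 0.3],
]

aucs = one_vs_rest_auc(Y_TRUE, PROBS, CLASSES)
for name, value in aucs.items():
    print(f"{name}: AUC = {value:.4f}")
```

The pairwise loop is quadratic in the number of examples, which is fine for a toy illustration; on a real dataset a sort-based or library implementation would be the usual choice.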

Author Biographies

Batyrkhan Omarov, International Information Technology University, Kazakhstan

PhD in Information System, Associate Professor, Department of Information System

Rustam Abdrakhmanov, International University Of Tourism and Hospitality, Kazakhstan

Candidate of Technical Science, Associate Professor, Department of Information Technologies

Aigerim Toktarova, Khoja Akhmet Yassawi International Kazakh-Turkish University, Kazakhstan

Master, Senior lecturer, Department of Computer Engineering
