COMPARATIVE ANALYSIS OF MACHINE LEARNING ALGORITHMS TO IDENTIFY EXTREMIST TEXTS IN THE KAZAKH LANGUAGE
DOI:
https://doi.org/10.37943/14DKRN4681
Keywords:
machine learning model, classification, extremist text.
Abstract
The article explores models and methods for classifying text content in order to identify destructive information in social networks. The study applies machine learning techniques to identify extremist texts: support vector machines, naive Bayes classifiers, random tree methods, decision trees, the k-nearest neighbors algorithm, logistic regression, and gradient boosting. The research findings demonstrate the effectiveness of these methods in the identification task.
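As a rough illustration of what comparing classifiers on a labeled text corpus involves, the sketch below trains two of the named methods, a multinomial naive Bayes classifier and a k-nearest neighbors classifier, on the same bag-of-words data and reports the accuracy of each. It is not the authors' implementation. The corpus is a hypothetical toy set of placeholder tokens, the labels `flagged` and `neutral` are assumptions, and the function names are invented for the example. A real study would use an annotated Kazakh-language corpus and a held-out evaluation protocol.

```python
import math
from collections import Counter

# Hypothetical toy corpus: placeholder tokens stand in for real posts.
TRAIN = [
    ("alpha beta alpha gamma", "flagged"),
    ("beta alpha delta alpha", "flagged"),
    ("gamma alpha beta beta", "flagged"),
    ("sun rain wind cloud", "neutral"),
    ("rain cloud sun sun", "neutral"),
    ("wind sun cloud rain", "neutral"),
]
TEST = [
    ("alpha beta beta", "flagged"),
    ("cloud rain sun", "neutral"),
    ("gamma alpha wind", "flagged"),
    ("sun wind delta", "neutral"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def train_naive_bayes(samples):
    """Collect per-class document counts, per-class token counts, and the vocabulary."""
    class_docs = Counter()
    token_counts = {}
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        counts = token_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return class_docs, token_counts, vocab


def predict_naive_bayes(model, text):
    """Pick the class with the highest log posterior, using add-one smoothing."""
    class_docs, token_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        counts = token_counts[label]
        denom = sum(counts.values()) + len(vocab)
        score = math.log(class_docs[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def predict_knn(samples, text, k=3):
    """Majority vote among the k training texts most similar to the query."""
    query = Counter(tokenize(text))
    ranked = sorted(
        samples,
        key=lambda s: cosine(query, Counter(tokenize(s[0]))),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(predict, test):
    """Fraction of test texts whose predicted label matches the true label."""
    hits = sum(predict(text) == label for text, label in test)
    return hits / len(test)


NB_MODEL = train_naive_bayes(TRAIN)
RESULTS = {
    "naive_bayes": accuracy(lambda t: predict_naive_bayes(NB_MODEL, t), TEST),
    "knn": accuracy(lambda t: predict_knn(TRAIN, t), TEST),
}

if __name__ == "__main__":
    for name, acc in RESULTS.items():
        print(f"{name}: accuracy={acc:.2f}")
```

The toy classes are cleanly separable, so the scores say nothing about how these methods rank on real data. The point is only the shape of the comparison: every classifier is trained and scored on the same data split.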
The article also provides an overview of existing research, methods, and software products for analyzing extremist texts. It highlights the significance of case-based learning, deductive learning models, and automated data collection and analysis, all of which contribute to the understanding and detection of extremist content.
The article further discusses the relevance and future prospects of the presented research. It emphasizes the need to expand the corpus of documents studied, enabling a more comprehensive analysis of texts, including those in photo, audio, and video formats. The development of complex models for recognizing hidden extremist propaganda is also identified as a key direction for future work.
By addressing these areas of focus, the research presented in the article aims to advance the field of identifying and combating extremist content within social networks. The incorporation of advanced techniques and technologies is crucial to effectively detect and address the presence of such content in various forms and formats.
License
Copyright (c) 2023 Articles are open access under the Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.