machine learning, time series analysis, linear regression, correlation matrix, crop yield


Analyzing and improving crop productivity is one of the most important tasks in precision agriculture worldwide, including in Kazakhstan, where agriculture plays a pivotal role in the economy and in sustaining the population. Accurate forecasting of agricultural yields is therefore essential for ensuring food security, optimizing resource use, and planning for adverse climatic conditions. Machine learning tools make in-depth analysis and high-quality forecasts possible.

This paper examines the relationship between weather conditions and agricultural output. Using datasets covering the period from 1990 to 2023, the study applies data analytics and machine learning techniques to improve the accuracy of agricultural yield forecasts. The central challenge is integrating and analyzing two distinct types of data: historical agricultural yield data and detailed daily weather records for North Kazakhstan over the same period. This requires understanding the patterns within each dataset as well as the interactions between them. The primary objective is to develop models that accurately predict crop yields from weather parameters, which is crucial for effective agricultural planning and resource allocation. Using statistical and mathematical analysis methods from machine learning, a time series analysis of the main weather factors presumed to affect crop yields was carried out, and a correlation matrix between these factors and the crops was constructed and analyzed.
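As an illustration of the correlation step described above, the following is a minimal sketch of a Pearson correlation matrix between weather factors and yield. It is not the authors' code: the column names (`mean_temp_c`, `precip_mm`, `yield_t_ha`) and all numeric values are hypothetical placeholders, and the paper's actual factors and data are not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict {name_a: {name_b: r}} for every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical seasonal aggregates, one value per year (illustrative only).
data = {
    "mean_temp_c": [18.0, 19.5, 17.2, 20.1, 18.8],
    "precip_mm": [210.0, 180.0, 250.0, 160.0, 200.0],
    "yield_t_ha": [14.0, 13.1, 15.2, 12.5, 13.8],
}

corr = correlation_matrix(data)
for name, row in corr.items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

In practice, daily weather records would first be aggregated to the same yearly resolution as the yield series before the two are correlated; the aggregation choice (for example, growing-season means or totals) is an assumption that the abstract does not specify.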

The study evaluated two regression metrics, Root Mean Squared Error (RMSE) and the coefficient of determination (R²), for the Random Forest, Decision Tree, and Support Vector Machine (SVM) algorithms. The results indicated that Random Forest generally outperformed Decision Tree and SVM in predictive accuracy for potato yield forecasting in the North Kazakhstan Region. The Random Forest Regressor showed the best performance, with R² = 0.97865. The RMSE values ranged from 0.25 to 0.46, indicating relatively low error rates, and the R² values were generally positive, indicating a good fit of the models to the data.
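For readers unfamiliar with the two metrics named above, the sketch below computes RMSE and R² from their standard definitions. It is an illustrative example, not the evaluation code used in the study; the function names and the observed and predicted values are hypothetical.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared residual."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(squared_errors))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the observed values are constant")
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted yields (illustrative values only).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

print("RMSE:", rmse(observed, predicted))
print("R2:", r2(observed, predicted))
```

Under these definitions, a model that always predicts the mean of the observed values scores R² = 0, and a model that does worse than that scores below zero, so a positive R² only indicates an improvement over the mean baseline. RMSE is expressed in the units of the target variable, so its magnitude should be read against the scale of the yields being predicted.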

This paper seeks to address these needs by providing insights and predictive models that can guide farmers, policymakers, and stakeholders in making informed decisions.






How to Cite

Mimenbayeva, A., Issakova, G., Tanykpayeva, B., Tursumbayeva, A., Suleimenova, R., & Tulkibaev, A. (2024). APPLYING MACHINE LEARNING FOR ANALYSIS AND FORECASTING OF AGRICULTURAL CROP YIELDS. Scientific Journal of Astana IT University, 17(17), 28–42.



Information Technologies