FARMAKOEKONOMIKA. Modern Pharmacoeconomics and Pharmacoepidemiology

COVID-19 pandemic prediction model based on machine learning in selected regions of the Russian Federation

Background. Forecasting the spread of the novel coronavirus infection (COVID-19) makes it possible to take timely measures and launch systemic preventive and anti-epidemic actions at both the regional and national levels to reduce morbidity and mortality.

Objective: to develop a model for short-term forecasting of COVID-19 cases and deaths in the Russian Federation.

Material and methods. The training data were collected from the Stopcoronavirus.rf and Johns Hopkins University portals. The dataset included 13 features describing infection dynamics and mortality, as well as morbidity and mortality rates in different countries and in individual regions of the Russian Federation. The model was trained with the CatBoost gradient boosting method and retrained daily on updated data.
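The daily retraining described in the methods can be pictured as a walk-forward loop: each day the model is refit on all data available so far and a fresh multi-day forecast is issued. The sketch below is only an illustration under stated assumptions, not the authors' pipeline. The lag-window features, the recursive multi-step forecast, and the `fit_mean_growth` stand-in learner are hypothetical choices made so the example runs on the standard library alone; the study itself used CatBoost with 13 features. The 14-day horizon matches the forecast period reported in the Results.

```python
from typing import Callable, List, Tuple

# A fitted model maps a window of recent daily values to a next-day forecast.
Model = Callable[[List[float]], float]


def make_windows(series: List[float], n_lags: int) -> Tuple[List[List[float]], List[float]]:
    """Turn a daily series into (lag window, next-day value) training pairs."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(list(series[t - n_lags:t]))
        y.append(series[t])
    return X, y


def fit_mean_growth(X: List[List[float]], y: List[float]) -> Model:
    """Hypothetical stand-in learner (NOT CatBoost): average day-over-day growth."""
    ratios = [target / window[-1] for window, target in zip(X, y) if window[-1] > 0]
    growth = sum(ratios) / len(ratios) if ratios else 1.0
    return lambda window: window[-1] * growth


def forecast(model: Model, history: List[float], n_lags: int, horizon: int) -> List[float]:
    """Recursive multi-step forecast: each prediction becomes the newest lag."""
    window = list(history[-n_lags:])
    predictions = []
    for _ in range(horizon):
        nxt = model(window)
        predictions.append(nxt)
        window = window[1:] + [nxt]
    return predictions


def daily_retrain(series: List[float], n_lags: int, horizon: int, first_day: int) -> List[List[float]]:
    """Refit from scratch for every day from first_day on, using only data seen so far."""
    runs = []
    for day in range(first_day, len(series) + 1):
        history = series[:day]
        X, y = make_windows(history, n_lags)
        model = fit_mean_growth(X, y)
        runs.append(forecast(model, history, n_lags, horizon))
    return runs


# Synthetic cumulative case counts growing by 10% per day (illustrative data only).
cases = [100.0, 110.0, 121.0, 133.1, 146.41]
runs = daily_retrain(cases, n_lags=2, horizon=14, first_day=3)
```

Each element of `runs` is one day's 14-day forecast. Swapping `fit_mean_growth` for a function that fits a gradient boosting regressor on `X` and `y` would leave the retraining loop unchanged.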

Results. A model forecasting COVID-19 cases and deaths up to 14 days ahead was created. The model's mean absolute percentage error (MAPE) ranged from 2.3% to 24% across 85 regions of the Russian Federation. The CatBoost machine learning method showed an advantage over linear regression in terms of root mean square error (RMSE). The error was lower for regions with large populations than for less populated ones.
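For readers less familiar with the two error measures named above, the following minimal sketch computes them using their standard definitions. The function names and the sample numbers are illustrative assumptions and are not taken from the study.

```python
import math
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Requires non-zero actual values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error, in the same units as the data (e.g. cases)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Illustrative daily case counts and forecasts (made-up numbers).
actual_cases = [100.0, 200.0, 400.0]
forecast_cases = [110.0, 190.0, 400.0]

mape_value = mape(actual_cases, forecast_cases)
rmse_value = rmse(actual_cases, forecast_cases)
```

MAPE is scale-free, so it can be compared across regions with very different case counts, whereas RMSE is expressed in the units of the data and is better suited to comparing two models on the same series.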

Conclusion. The model can be used not only to predict the course of the novel coronavirus pandemic but also to control and assess the spread of newly emerging infections during their emergence, peak incidence, and stabilization periods.

About the Authors

D. V. Gavrilov
Russian Federation

Denis V. Gavrilov – Head of Medical Department

17 premises 62 Varkaus Qy, Republic of Karelia, Petrozavodsk 185031, Russia

R. V. Abramov
Russian Federation

Roman V. Abramov – Data Analyst

17 premises 62 Varkaus Qy, Republic of Karelia, Petrozavodsk 185031, Russia

A. V. Kirilkina
Republican Infectious Diseases Hospital
Russian Federation

Anna V. Kirilkina – Deputy Chief Physician for Medicine

42 Kirov Str., Republic of Karelia, Petrozavodsk 185035, Russia

A. A. Ivshin
Petrozavodsk State University
Russian Federation

Aleksandr A. Ivshin – MD, PhD, Chief of Chair of Obstetrics and Gynecology, Dermatovenerology

Scopus Author ID: 57222275843; RSCI SPIN-code: 8196-6605

33 Lenin Ave., Republic of Karelia, Petrozavodsk 185910, Russia

R. E. Novitskiy
Russian Federation

Roman E. Novitskiy – Director General

17 premises 62 Varkaus Qy, Republic of Karelia, Petrozavodsk 185031, Russia


For citation:

Gavrilov D.V., Abramov R.V., Kirilkina A.V., Ivshin A.A., Novitskiy R.E. COVID-19 pandemic prediction model based on machine learning in selected regions of the Russian Federation. FARMAKOEKONOMIKA. Modern Pharmacoeconomics and Pharmacoepidemiology. 2021;14(3):342–356. (In Russ.)

ISSN 2070-4909 (Print)
ISSN 2070-4933 (Online)