Convolutional neural networks and transformers in skin tumor diagnostics: a comparative analysis of the efficiency of artificial intelligence models in computer vision programs

A. I. Lamotkin; D. I. Korabelnikov

doi:10.17749/2070-4909/farmakoekonomika.2025.327

Abstract

Background. When artificial intelligence (AI) models based on computer vision are used to diagnose skin tumors, melanoma in particular, even minimal classification errors are critical. Vision transformers (ViT), a relatively new model architecture, have shown promising results in computer vision; however, their efficiency in classifying skin lesions has not been studied sufficiently. This study is one of the first to directly compare convolutional neural networks (CNNs) and ViT on clinically relevant metrics, which is especially important for the implementation of AI in dermatological and oncological practice.

Objective. To compare the performance of CNNs and ViT in two binary classification tasks for skin lesions: “melanoma/non-melanoma” and “benign/malignant”.

Material and methods. The study used CNN (MobileNetV2, Xception) and ViT architectures. Testing was carried out on independent datasets (3000 and 4800 images, respectively), and performance was assessed by accuracy, sensitivity, and specificity. Augmentation, class balancing, hyperparameter optimization, and removal of artifacts from the data were used.
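The three metrics named above can all be derived from a binary confusion matrix. The following is a minimal sketch and not the authors' evaluation code: the function name `binary_metrics` and the toy labels are illustrative assumptions, with 1 denoting the positive class (for example, melanoma) and 0 the negative class.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for 0/1 labels.

    Hypothetical helper for illustration; 1 = positive class, 0 = negative.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    return {
        # Share of all images classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of true positives found (missed melanomas lower this).
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of true negatives found (false alarms lower this).
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy data (made up): 3 positive and 5 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy matters on imbalanced data, where a model can reach high accuracy while still missing many positive cases.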

Results. ViT outperformed the CNNs. In the “melanoma/non-melanoma” task, its accuracy was 92.93% versus 88% for Xception, and in the “benign/malignant” task it was 91.35% versus 85%. The transformer model also demonstrated better specificity (up to 95%).

Conclusion. ViT provides higher accuracy due to its analysis of global patterns, although it requires careful tuning and high-quality data. CNNs remain a stable solution when data are limited. For clinical use, a combination of both architectures is recommended to improve diagnostic reliability.
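The abstract does not say how the two architectures should be combined. One common option is soft voting, in which the predicted probabilities of both models are averaged before thresholding. The sketch below illustrates that idea only; the function name `soft_vote`, the weight `w_vit`, the threshold, and the example probabilities are all assumptions and not taken from the study.

```python
def soft_vote(p_cnn, p_vit, w_vit=0.5, threshold=0.5):
    """Combine per-image positive-class probabilities from two models.

    Hypothetical helper: w_vit is the weight given to the ViT output,
    and (1 - w_vit) is the weight given to the CNN output.
    """
    if not 0.0 <= w_vit <= 1.0:
        raise ValueError("w_vit must be between 0 and 1")
    if len(p_cnn) != len(p_vit):
        raise ValueError("both models must score the same images")

    # Weighted average of the two probabilities for each image.
    combined = [(1.0 - w_vit) * c + w_vit * v for c, v in zip(p_cnn, p_vit)]
    # Label an image positive when the averaged probability reaches the threshold.
    labels = [1 if p >= threshold else 0 for p in combined]
    return combined, labels


# Made-up probabilities for three images.
p_cnn = [0.9, 0.4, 0.2]
p_vit = [0.7, 0.8, 0.1]
combined, labels = soft_vote(p_cnn, p_vit)
```

In the second image the two models disagree at a 0.5 threshold, and the averaged score decides the label. In practice the weight and threshold would need to be chosen on a validation set, with the threshold set according to the clinical cost of a missed malignancy.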

Keywords

artificial intelligence, computer vision, convolutional neural networks, vision transformers, image classification, melanoma, benign neoplasms, malignant neoplasms, diagnosis

About the Authors

A. I. Lamotkin

Moscow Haass Medical and Social Institute; Central Research Institute of Organization and Informatization of Healthcare
Russian Federation

Andrey I. Lamotkin

5 2nd Brestskaya Str., Moscow 123056

11 Dobrolyubov Str., Moscow 127254

D. I. Korabelnikov

Moscow Haass Medical and Social Institute
Russian Federation

Daniil I. Korabelnikov, PhD, Assoc. Prof.

5 2nd Brestskaya Str., Moscow 123056

What is already known about the subject?

► Convolutional neural networks (CNNs) are widely used in medical diagnostics; however, these architectures have limitations in interpretability and robustness to domain shifts

► Vision transformers (ViT) demonstrate high performance in image classification tasks

► Class balancing and data augmentation are critical for model quality

What are the new findings?

► This study is one of the first to directly compare CNN and ViT architectures in the task of skin tumor classification using clinically relevant metrics

► ViT showed superiority in accuracy (up to 92.93%) and specificity (up to 95%), particularly in complex diagnostic cases

How might it impact the clinical practice in the foreseeable future?

► ViT implementation may improve the accuracy of melanoma diagnostics and reduce the number of false-positive results

► Hybrid solutions (CNN + ViT) can become the standard for automated differential diagnosis of benign and malignant skin lesions, melanoma in particular

► The results highlight the importance of data quality and pre-processing methods for implementing artificial intelligence in medicine

For citations:

Lamotkin A.I., Korabelnikov D.I. Convolutional neural networks and transformers in skin tumor diagnostics: a comparative analysis of the efficiency of artificial intelligence models in computer vision programs. FARMAKOEKONOMIKA. Modern Pharmacoeconomics and Pharmacoepidemiology. 2025;18(3):365-375. (In Russ.) https://doi.org/10.17749/2070-4909/farmakoekonomika.2025.327

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

ISSN 2070-4909 (Print)
ISSN 2070-4933 (Online)
