Abstract: |
Cancer, a leading cause of premature death globally, is rising in incidence, with new cases projected to reach 28.4 million by 2040. Immunotherapy with immune checkpoint inhibitors (ICIs), such as PD-1/PD-L1 inhibitors, is a promising treatment avenue. However, patient response rates vary, prompting the search for predictive biomarkers. Existing markers, often derived from transcriptomic analyses, show only moderate accuracy and are hindered by cancer heterogeneity and tissue specificity. Artificial intelligence models, broadly grouped into regression, classification, and deep learning approaches, have shown promise. Nevertheless, the limitations of current biomarkers call for combined predictions that use multiple markers and account for various biological mechanisms. In this study, a machine learning model using RNA sequencing data from 546 patients with urothelial, renal, thymic, non-small cell, and oral cavity carcinomas, as well as melanoma, drawn from nine different cohorts in public databases, identified 55 genes influencing response classification. The GradientBoosting model demonstrated superior predictive performance compared to previous reports, with an AUC of 0.95, a recall of 0.84, and a specificity of 0.90. Clustering the model's SHapley Additive exPlanations (SHAP) values revealed nine sample groups, each with a majority class and eight of them associated with different types of cancer, demonstrating the potential for agnostic prediction models. |