Optimizing Survival Analysis of XGBoost for Ties to Predict Disease Progression of Breast Cancer

Abstract

Objective: Several prognostic models for breast cancer based on survival analysis methods have been proposed and extensively validated; they provide an essential tool for clinical diagnosis and treatment aimed at improving patient survival. To analyze clinical and follow-up data from 12,119 breast cancer patients at the Clinical Research Center for Breast (CRCB) of West China Hospital, Sichuan University, we developed a gradient boosting algorithm, called EXSA, that optimizes the survival analysis of the XGBoost framework for tied event times in order to predict disease progression in breast cancer.

Methods: EXSA builds on the XGBoost framework from machine learning and the Cox proportional hazards model from survival analysis. By taking the Efron approximation of the partial likelihood function as the learning objective for ties, EXSA derives gradient formulas for a more precise approximation, which improves the ability of XGBoost to handle survival data with ties. After retaining 4575 patients (3202 for training, 1373 for testing), we used EXSA to build a prognostic model that estimates disease progression. The model produces a risk score for disease progression, and we also present the risk grouping and the continuous functions relating risk scores to disease progression rates at 5 and 10 years.
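To make the role of ties concrete, the following is a minimal sketch of the textbook Efron approximation to the negative log partial likelihood, together with its first-order gradient with respect to the predicted risk scores. It is not the EXSA implementation: the function names `efron_nll_and_grad` and `breslow_nll` and the toy data are hypothetical, the second-order terms that a gradient boosting objective would also need are omitted, and the simpler Breslow approximation is included only for comparison.

```python
import numpy as np


def efron_nll_and_grad(f, time, event):
    """Efron negative log partial likelihood and its gradient w.r.t. f.

    f     : predicted log-risk scores, one per patient
    time  : observed follow-up times
    event : 1 if progression was observed, 0 if censored
    """
    f = np.asarray(f, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    exp_f = np.exp(f)

    nll = 0.0
    grad = np.zeros_like(f)
    for t in np.unique(time[event]):       # distinct event times
        at_risk = time >= t                # risk set at time t
        tied = (time == t) & event         # events tied at time t
        d = int(tied.sum())
        risk_sum = exp_f[at_risk].sum()
        tied_sum = exp_f[tied].sum()

        nll -= f[tied].sum()
        grad[tied] -= 1.0
        for l in range(d):
            # Efron: remove a fraction l/d of the tied events from the risk set
            frac = l / d
            denom = risk_sum - frac * tied_sum
            nll += np.log(denom)
            weight = at_risk.astype(float) - frac * tied
            grad += weight * exp_f / denom
    return nll, grad


def breslow_nll(f, time, event):
    """Breslow negative log partial likelihood (full risk set for every tie)."""
    f = np.asarray(f, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    exp_f = np.exp(f)

    nll = 0.0
    for t in np.unique(time[event]):
        at_risk = time >= t
        tied = (time == t) & event
        nll += tied.sum() * np.log(exp_f[at_risk].sum()) - f[tied].sum()
    return nll


# Toy data (made up for illustration): two ties at t=2 and two at t=5.
time = [2, 2, 3, 5, 5, 5, 7]
event = [1, 1, 0, 1, 1, 0, 1]
scores = np.array([0.3, -0.2, 0.1, 0.5, -0.4, 0.0, 0.2])

nll, grad = efron_nll_and_grad(scores, time, event)
print("Efron NLL:", nll, "Breslow NLL:", breslow_nll(scores, time, event))
```

When no event times are tied, the two functions return the same value; the approximations differ only in how tied events are removed from the risk set.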